Abstract

We determine under which conditions certain natural models of random constraint satisfaction
problems have sharp thresholds of satisfiability. These models include graph and hypergraph
homomorphism, the (d, k, t)-model, and binary constraint satisfaction problems with domain size
three.

1 Introduction 

Random 3-SAT and its generalizations have been studied intensively for the past decade or so (see
e.g. [1, 5, 7, 16, 27, 2, 6, 12, 15, 34, 8]). One of the most interesting things about these models, and
arguably the main reason that most people study them, is that many of them exhibit what is called
a sharp threshold of satisfiability: a critical clause-density at which the random problem suddenly
moves from being almost surely satisfiable to almost surely unsatisfiable (both terms are defined
formally below). Most of the work on these problems is, at least implicitly, an attempt to determine
the precise locations of their thresholds. At this point, these locations are known only for a handful
of the problems, such as [2, 6, 12, 15, 34, 8]. Just proving the existence of a sharp threshold for
random 3-SAT was considered a major breakthrough by Friedgut [16]. The vast majority of these
generalizations appear to have sharp thresholds, but there are exceptions which are said to have
coarse thresholds (also defined formally below).

The ultimate goal of the present line of enquiry is to determine precisely which of these models 
have sharp thresholds, but this appears to be quite difficult; in Section 2 we show that it is at least 
as difficult as determining the location of the threshold for 3-colourability, something that has been 
sought after for more than 50 years (see e.g. [13, 30, 11]). A more fundamental goal is to obtain a better 
understanding of what can cause some problems to have coarse thresholds rather than sharp ones. 

Molloy [31] and, independently, Creignou and Daudé [9] introduced a wide family of models for random
constraint satisfaction problems which includes 3-SAT and many of its generalizations. This permits
us to study them under a common umbrella, rather than one at a time. Molloy determined precisely
which models from this family have any threshold at all ([9] provides the same result for those models
with domain size 2), but he left open the much more important question of which models have
sharp thresholds. In this paper, we begin to address this question. We answer it for two of the most
natural subfamilies: the so-called (d, k, t)-family (Theorem 3), and the family of graph and hypergraph
homomorphism problems (Theorem 5). We also study binary constraint satisfaction problems with
domain size 3. (Domain size and the (d, k, t)-family are defined formally below.)

The standard example of a problem with a coarse threshold is 2-colourability. Here, there is a
coarse threshold precisely because unsatisfiability (i.e. non-2-colourability) can be caused only by the
presence of odd cycles. Roughly speaking, Friedgut's theorem [16] implies that a problem exhibits a
coarse threshold iff unsatisfiability is approximately equivalent to having one of a set of unicyclic
subproblems (unicyclic is defined formally below). It is not hard to see that if there are unsatisfiable
unicyclic instances of a problem then that problem exhibits a coarse threshold (or exhibits no
threshold at all). This makes it quite natural to pose the following rule-of-thumb:

Hypothesis A: If a random model from the family in [31] is such that: (a) it exhibits a threshold,
and (b) every unicyclic instance is satisfiable, then that threshold is sharp.

However, reality is not that simple. [31] presents a counterexample to Hypothesis A; others are
presented in this paper. Nevertheless, the hypothesis holds for certain subfamilies of models. Creignou
and Daudé [9] conjectured that Hypothesis A holds for problems with domain size two. Special cases
of this conjecture were proven by Istrate [25] and, independently, Creignou and Daudé [10]; the proofs of
each paper can be extended to cover the entire conjecture. Theorems 3 and 5 in this paper show that
Hypothesis A holds for the (d, k, t)-models and for homomorphisms to connected graphs.

In general, coarse thresholds can be caused by much more subtle and insidious reasons than
unsatisfiable unicyclic instances. In this paper we begin to understand some of these reasons by focusing on
the case where the constraint size is two and the domain size is three (a natural next step after the
well-understood domain-size-two case). For this case, we identify a particular subtle property that must
hold whenever Hypothesis A fails (Theorem 17). If we permit either greater domain sizes or greater
constraint sizes then this is no longer true.

1.1 The random models 

In our setting, the variables of a constraint satisfaction problem (CSP) all have the same domain of
permissible values, $\{1, \ldots, d\}$, and all constraints will have size $k$, for some fixed integers $d, k$. Given a
$k$-tuple of variables, $(x_1, \ldots, x_k)$, a restriction on $(x_1, \ldots, x_k)$ is a $k$-tuple of values
$R = (\delta_1, \ldots, \delta_k)$ where $1 \le \delta_i \le d$ for each $i$. For each $k$-tuple $(x_1, \ldots, x_k)$, the set of
restrictions on that $k$-tuple is called a constraint. The empty constraint is the constraint which contains
no restrictions. We say that an assignment of values to the variables of a constraint $C$ satisfies $C$ if that
assignment is not one of the restrictions in $C$. An assignment of values to all variables in a CSP satisfies
that CSP if every constraint is simultaneously satisfied. A CSP is satisfiable if it has such a satisfying
assignment.
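
The definitions above can be made concrete with a small sketch. It is only an illustration, not part of the paper: the representation of a constraint as a pair (scope, set of forbidden tuples) and the function names `satisfies` and `is_satisfiable` are assumptions chosen here, and the search is plain brute force.

```python
from itertools import product

def satisfies(assignment, constraints):
    """True if no constraint's scope is assigned one of its restrictions.

    assignment: dict mapping variable -> value in {1, ..., d}
    constraints: list of (scope, restrictions), where scope is a k-tuple of
    variables and restrictions is a set of forbidden k-tuples of values.
    """
    return all(
        tuple(assignment[x] for x in scope) not in restrictions
        for scope, restrictions in constraints
    )

def is_satisfiable(variables, d, constraints):
    """Brute-force search over all d^n assignments (illustration only)."""
    for values in product(range(1, d + 1), repeat=len(variables)):
        if satisfies(dict(zip(variables, values)), constraints):
            return True
    return False

# 2-colourability as a CSP with d = 2, k = 2: forbid equal values on an edge.
not_equal = {(1, 1), (2, 2)}
triangle = [(("a", "b"), not_equal), (("b", "c"), not_equal), (("a", "c"), not_equal)]
path = triangle[:2]
```

With this encoding the triangle (an odd cycle) is unsatisfiable for d = 2 while the path is satisfiable, matching the 2-colourability discussion in the introduction.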

It will be convenient to consider a set of canonical variables $X_1, \ldots, X_k$ which are used only to describe
the "pattern" of a constraint. These canonical variables are not variables of the actual CSP. For any $d, k$
there are $d^k$ possible restrictions and $2^{d^k}$ possible constraints over the $k$ canonical variables. We denote
this set of constraints as $\mathcal{C}^{d,k}$. For our random model, one begins by specifying a particular probability
distribution, $\mathcal{P}$, over $\mathcal{C}^{d,k}$. We use $\mathrm{supp}(\mathcal{P})$ to denote the support of $\mathcal{P}$, i.e. the set of constraints $C$
with $\mathcal{P}(C) > 0$. Different choices of $\mathcal{P}$ give rise to different instances of the model.

We now define our random models. The "$G_{n,M}$" model, where the number of constraints is fixed to
be $M$, is the most common. But in this paper, it will be much more convenient to focus on the "$G_{n,p}$"
model where each $k$-tuple of variables is chosen independently with probability $p = c/n^{k-1}$ to receive
a constraint. The two models are, in most respects, equivalent when $M \approx (c/k!)n$. In particular, it is
straightforward to show that one exhibits a sharp threshold iff the other does.

The $CSP_{n,p}(\mathcal{P})$ Model: Specify $n, p$ and $\mathcal{P}$ (typically $p = c/n^{k-1}$ for some constant $c$; note that
$\mathcal{P}$ implicitly specifies $d, k$). First choose a random $k$-uniform hypergraph on $n$ variables where each of
the $\binom{n}{k}$ potential hyperedges is selected with probability $p$. Next, for each hyperedge $e$, we choose a
constraint on the $k$ variables of $e$ as follows: we take a random permutation from the $k$ variables onto
$\{X_1, \ldots, X_k\}$ and then we select a random constraint according to $\mathcal{P}$ and map it onto the $k$ variables.
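
The sampling procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions: the name `sample_csp`, the encoding of the distribution as a list of (pattern, probability) pairs, and the explicit enumeration of all k-subsets (practical only for small n) are choices made here, not taken from the paper.

```python
import random
from itertools import combinations

def sample_csp(n, k, p, dist, rng=None):
    """Draw one instance of the model with at most one constraint per k-subset.

    dist: list of (pattern, probability); a pattern is a frozenset of forbidden
    k-tuples over the canonical positions X_1..X_k.
    Returns a list of (scope, pattern), where scope is an ordered k-tuple of
    variables: scope[i] plays the role of canonical variable X_{i+1}.
    """
    rng = rng or random.Random()
    patterns = [pattern for pattern, _ in dist]
    weights = [weight for _, weight in dist]
    instance = []
    for edge in combinations(range(n), k):
        if rng.random() < p:               # each potential hyperedge independently
            scope = list(edge)
            rng.shuffle(scope)             # random bijection onto X_1..X_k
            pattern = rng.choices(patterns, weights=weights)[0]
            instance.append((tuple(scope), pattern))
    return instance
```

For example, with `dist = [(frozenset({(1, 1), (2, 2)}), 1.0)]`, k = 2 and p = c/n, this produces a random 2-colouring instance on a sparse random graph.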

A property holds almost surely (a.s.) if the limit as $n \to \infty$ of the probability of it holding is 1.
In [9, 31] it was shown that for every $\mathcal{P}$, either: (i) $CSP_{n,p}(\mathcal{P})$ is a.s. satisfiable for every $c > 0$,
(ii) $CSP_{n,p}(\mathcal{P})$ is a.s. unsatisfiable for every $c > 0$, or (iii) there is some $c_1 < c_2$ such that
$CSP_{n,p}(\mathcal{P})$ is a.s. satisfiable for every $0 < c < c_1$ and $CSP_{n,p}(\mathcal{P})$ is a.s. unsatisfiable for every
$c > c_2$. We say that $CSP_{n,p}(\mathcal{P})$ has a sharp threshold of satisfiability if there is some positive-valued
function $c(n) = \Theta(1)$ such that for every $\epsilon > 0$, if $p = (1 - \epsilon)c(n)/n^{k-1}$ then $CSP_{n,p}(\mathcal{P})$ is a.s.
satisfiable and if $p = (1 + \epsilon)c(n)/n^{k-1}$ then $CSP_{n,p}(\mathcal{P})$ is a.s. unsatisfiable. This is often abbreviated
to just sharp threshold. We say that $CSP_{n,p}(\mathcal{P})$ has a coarse threshold if for all $c$ in some interval
$c_1(n) < c < c_2(n)$, $CSP_{n,p}(\mathcal{P})$ is neither a.s. satisfiable nor a.s. unsatisfiable. It is easy to see that
every $\mathcal{P}$ satisfying case (iii) above must have either a coarse threshold or a sharp threshold.

Each $k$-tuple of vertices can have at most one constraint in $CSP_{n,p}(\mathcal{P})$. When applying Friedgut's
theorem, it will be convenient to relax this condition, and allow $k$-tuples to possibly receive multiple
constraints. Thus up to $k! \times |\mathrm{supp}(\mathcal{P})|$ constraints can appear on a $k$-tuple of variables. We write
$\widehat{CSP}_{n,p}(\mathcal{P})$ for this relaxed model.

The $\widehat{CSP}_{n,p}(\mathcal{P})$ Model: Specify $n, p$ and $\mathcal{P}$. For each of the $n(n-1)\cdots(n-k+1)$ ordered $k$-tuples
of variables and each constraint $C \in \mathrm{supp}(\mathcal{P})$, we assign $C$ to the ordered $k$-tuple with probability
$\mathcal{P}(C) \times p/k!$.

Note that the expected total number of constraints is the same under each model. Furthermore,
it is easy to calculate that the probability of at least one $k$-tuple receiving more than one constraint
in $\widehat{CSP}_{n,p}(\mathcal{P})$ is $o(1)$ for $k \ge 3$ and, for $k = 2$, an absolute constant $0 < a < 1$. It follows that if a
property holds a.s. in $\widehat{CSP}_{n,p}(\mathcal{P})$ then it holds a.s. in $CSP_{n,p}(\mathcal{P})$. As a corollary, we have:

Lemma 1 If $\widehat{CSP}_{n,p}(\mathcal{P})$ has a sharp threshold then so does $CSP_{n,p}(\mathcal{P})$. The reverse is true for $k \ge 3$.

So for the remainder of the paper, whenever we wish to prove that $CSP_{n,p}(\mathcal{P})$ has a sharp threshold,
we will work in the $\widehat{CSP}_{n,p}(\mathcal{P})$ model.
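
The claim that the two models have the same expected number of constraints is a one-line calculation, sketched here numerically. The function names are hypothetical; the only inputs are n, k, p and the probabilities that the distribution assigns to the constraints in its support.

```python
from math import comb, factorial, isclose, perm

def expected_constraints_simple(n, k, p):
    """One potential constraint per k-subset, each present with probability p."""
    return comb(n, k) * p

def expected_constraints_relaxed(n, k, p, support_probs):
    """Each (ordered k-tuple, constraint C) pair is present with probability P(C) * p / k!."""
    ordered_tuples = perm(n, k)            # n (n-1) ... (n-k+1)
    return ordered_tuples * sum(w * p / factorial(k) for w in support_probs)

# The k! orderings of a k-subset each contribute p/k!, so the totals agree.
same = isclose(
    expected_constraints_simple(10, 3, 0.01),
    expected_constraints_relaxed(10, 3, 0.01, [0.25, 0.75]),
)
```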

We often focus on the constraint hypergraph of a CSP; i.e. the hypergraph whose vertices are the
variables and whose edges are the tuples of variables that have constraints. A tree-CSP is a CSP whose
constraint hypergraph is a hypertree. A CSP is unicyclic if its constraint hypergraph is unicyclic; i.e.
has exactly one cycle. (Hypertree and cycle are defined below.)

$F_1$ is said to be a sub-CSP of $F_2$ if every variable of $F_1$ is a variable of $F_2$ and every constraint of
$F_1$ is a constraint of $F_2$.

We close this subsection with some hypergraph definitions. A hypergraph consists of a set of vertices
and a set of hyperedges, where each hyperedge is a collection of vertices. If every hyperedge has size
exactly $k$ then the hypergraph is $k$-uniform. In a simple hypergraph, no vertex appears twice in any one
hyperedge, and no two edges are identical. So, for example, the constraint hypergraph of $CSP_{n,p}(\mathcal{P})$ is
simple, but the constraint hypergraph of the relaxed model $\widehat{CSP}_{n,p}(\mathcal{P})$, in which a $k$-tuple may receive
several constraints, may have multiple edges. Neither model permits multiple copies of a vertex in a
single edge, but such edges are possible when we discuss hypergraph homomorphism problems. The edge
$(v, v, \ldots, v)$ is called a loop.

A walk $P$ of length $r$ is a sequence of $r$ hyperedges and $r + 1$ vertices $(v_0, e_1, v_1, e_2, v_2, \ldots, e_r, v_r)$
such that $e_i$ contains both $v_{i-1}$ and $v_i$. A walk is a path if the $v_i$ are distinct. A walk is a cycle of size
$r$ if for $i = 1, \ldots, r$ the $v_i$ and $e_i$ are distinct, and $v_0 = v_r$. The distance from a vertex $u$ to a vertex $v$
is the minimum $r$ such that there exists a walk of length $r$, $(v, e_1, v_1, \ldots, e_r, u)$; the distance of a vertex
from itself is defined to be 0. The distance from a vertex $v$ to a set of vertices is the minimum distance
from $v$ to any vertex in the set. A hypergraph is a hypertree if it has no cycles and it is connected.
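
As an illustration of the hypertree definition, the following sketch tests it through the bipartite incidence graph (one node per vertex, one per hyperedge, joined when the vertex lies in the hyperedge). It assumes a simple hypergraph and reads "cycle" as having size at least 2, under which a hypergraph is a hypertree exactly when its incidence graph is a tree. The function name `is_hypertree` is a choice made here.

```python
def is_hypertree(vertices, hyperedges):
    """Connected and cycle-free, checked on the vertex/hyperedge incidence graph."""
    vertices = list(vertices)
    if not vertices:
        return False
    adj = {("v", v): [] for v in vertices}
    for i, edge in enumerate(hyperedges):
        adj[("e", i)] = []
        for v in edge:
            adj[("e", i)].append(("v", v))
            adj[("v", v)].append(("e", i))
    # A graph is a tree iff it is connected and has (number of nodes - 1) edges.
    incidences = sum(len(edge) for edge in hyperedges)
    if incidences != len(adj) - 1:
        return False
    start = ("v", vertices[0])
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbour in adj[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(adj)
```

Two hyperedges sharing two vertices fail the test, since they form a cycle of size 2 in the sense defined above.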

By contracting two vertices $u$ and $v$ into a new vertex $w$, we mean (i) adding a new vertex $w$ to the
set of the vertices, (ii) replacing $u$ and $v$ in every hyperedge by $w$, and (iii) removing $u$ and $v$. Note
that this may result in a hyperedge containing multiple copies of $w$.

1.2 Two special families



The ultimate goal of this research is to characterize all distributions $\mathcal{P}$ for which $CSP_{n,p}(\mathcal{P})$ exhibits a
sharp threshold. However, in Section 2, we will show that this is very difficult by proving the following:

Observation 2 If one can determine which distributions $\mathcal{P}$ yield sharp thresholds for $CSP_{n,p}(\mathcal{P})$ then
one can determine the locations of each of those thresholds.

Determining the locations of some of these thresholds, e.g. the 3-SAT threshold or the 3-colourability
threshold, is notoriously difficult (see [30] for a survey on those two thresholds). So Observation 2
suggests that we should set our sights lower and focus on some important classes of distributions.

Perhaps the most natural choice for $\mathcal{P}$ is the distribution obtained by selecting each of the $d^k$ possible
restrictions independently with probability $q$ for some constant $q$. However, as noted in [4], every such
choice of $\mathcal{P}$ yields a model that is a.s. unsatisfiable for any non-trivial choice of $p$. So this is a rather
uninteresting family of models, particularly as far as the study of thresholds goes.

The next most natural choice for $\mathcal{P}$ is to fix $t$, the number of restrictions per clause, and to make
every constraint with exactly $t$ restrictions equally likely. (Note that for $d = 2$, $t = 1$ this yields random
$k$-SAT.) This is often called the $(d, k, t)$-model and has received a great deal of study, both from a
theoretical perspective [28, 32] and from experimentalists (see [19] for a survey of many such studies).
In [4] it is shown that when $t \ge d^{k-1}$, this model is problematic in the same way as the previously
mentioned one, as it is a.s. unsatisfiable even for values of $p = o(1/n^{k-1})$ (i.e. when the number of
constraints is $o(n)$). However, it was proven in [19] that for every $1 \le t < d^{k-1}$, the $(d, k, t)$-model does
not have that problem. One of the main contributions of this paper is to show that for this case, the
model exhibits a sharp threshold:

Theorem 3 For every $d, k \ge 2$ and every $1 \le t < d^{k-1}$, the $(d, k, t)$-model has a sharp threshold.

Note that this generalizes the well-known result that $k$-SAT has a sharp threshold ([6, 20] for $k = 2$;
[16] for $k \ge 3$), as can be seen by setting $d = 2$, $t = 1$.

From a different perspective, it is quite natural to consider the case where every constraint is
identical, i.e. $|\mathrm{supp}(\mathcal{P})| = 1$. It is not hard to see that every such problem is equivalent to a hypergraph
homomorphism problem, as defined below:

For two $k$-uniform hypergraphs, $G, H$, a homomorphism from $G$ to $H$ is a mapping $h$ from $V(G)$
to $V(H)$ such that for each edge $(v_1, v_2, \ldots, v_k)$ of $G$, $(h(v_1), h(v_2), \ldots, h(v_k))$ is an edge of $H$. We
say that $G$ is homomorphic to $H$ if there exists such a homomorphism. When $k = 2$ and $H$ is the
complete graph with no loops, we are simply asking whether $G$ has a $|H|$-colouring. Homomorphisms
are an important generalization of graph colouring (see e.g. [24]). They are often also referred to as
$H$-colourings (e.g. [23, 21]).

Suppose that $H$ is a fixed undirected $k$-uniform hypergraph, and $G$ is a random $k$-uniform
hypergraph on $n$ vertices where each of the $\binom{n}{k}$ potential hyperedges is selected with probability $p$. Set $d$ to
be equal to the number of vertices in $H$ and define a constraint $C$ with domain size $d$ and constraint
size $k$ by saying that $C$ permits $X_1 = \delta_1, \ldots, X_k = \delta_k$ iff $(\delta_1, \ldots, \delta_k)$ is a hyperedge of $H$. Treat each
vertex of $G$ as a variable with domain $\{1, \ldots, d\}$ and assign $C$ to each hyperedge of $G$. We call this the
$H$-homomorphism problem.

Thus we have an instance of $CSP_{n,p}(\mathcal{P})$ where $C$ is the only constraint in $\mathrm{supp}(\mathcal{P})$ and furthermore
$C$ is symmetric under permutations of the canonical variables; in other words, all constraints are
identical even under permutations of variables. It is easy to see that every such $\mathcal{P}$ corresponds to a
homomorphism problem; just take $H$ to be the hypergraph where $(\delta_1, \ldots, \delta_k)$ is a hyperedge iff $C$ permits
$X_1 = \delta_1, \ldots, X_k = \delta_k$. Note that here a hyperedge in $H$ may contain multiple copies of a vertex.

Thus, these $H$-homomorphism problems are not only important as a fundamental graph problem,
but also because they form a very natural subclass of our family of random CSP models. In this paper,
we prove that Hypothesis A holds for every connected undirected $H$.
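
To make the homomorphism definition concrete, here is a brute-force sketch for small undirected hypergraphs. The function name `has_homomorphism` and the choice to represent an undirected hyperedge by all orderings of its vertices are assumptions made for this illustration.

```python
from itertools import permutations, product

def has_homomorphism(g_vertices, g_edges, h_vertices, h_edges):
    """True if some map V(G) -> V(H) sends every edge of G to an edge of H."""
    # H is undirected, so every ordering of a hyperedge of H is permitted.
    allowed = {ordering for edge in h_edges for ordering in permutations(edge)}
    g_vertices = list(g_vertices)
    for image in product(list(h_vertices), repeat=len(g_vertices)):
        h = dict(zip(g_vertices, image))
        if all(tuple(h[v] for v in edge) in allowed for edge in g_edges):
            return True
    return False

triangle = [(0, 1), (1, 2), (0, 2)]
five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
```

Taking H to be the triangle recovers 3-colourability: the 5-cycle is homomorphic to the triangle, while the triangle is not homomorphic to the 5-cycle.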

It is easy to see that if $H$ has a loop $(\delta, \delta, \ldots, \delta)$ then every hypergraph is trivially homomorphic to $H$
(just map every vertex to $\delta$); so the $H$-homomorphism problem has no threshold at all. The other trivial
case is where $H$ has no hyperedges at all and so no non-trivial hypergraph has an $H$-homomorphism.

Lemma 4 Suppose that H is a nontrivial k-uniform hypergraph with no loops. We have the following: 

(a) For $k \ge 3$, every unicyclic $k$-uniform hypergraph is homomorphic to a single hyperedge, and hence
to $H$.

(b) For $k = 2$: if the triangle is homomorphic to $H$, then so is every unicyclic graph; and the triangle
is homomorphic to $H$ iff $H$ contains a triangle.

Proof. To prove part (a), let $(v_0, e_1, v_1, e_2, v_2, \ldots, e_r, v_0)$ be the unique cycle of the hypergraph, and
let $(w_0, \ldots, w_{k-1})$ be a single hyperedge. Define $h(v_i) = w_{(i \bmod 2)}$ for every $0 \le i \le r - 2$ and
$h(v_{r-1}) = w_2$. Then the two cycle vertices in each $e_i$ receive distinct images, and it is easy to see that
one can extend $h$ to a homomorphism from the unicyclic hypergraph to $H$.

Part (b) easily follows from the easy and well-known fact that every cycle is homomorphic to the
triangle, and the triangle is not homomorphic to any cycle of size greater than 3. ■
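
The map used in the proof of part (a) can be checked mechanically. The sketch below encodes the cycle vertices as 0, ..., r-1 and the images w_0, w_1, w_2 as 0, 1, 2 (an encoding chosen here, with hypothetical function names), and verifies the property the extension step relies on: consecutive cycle vertices always receive different images.

```python
def cycle_map(r):
    """h(v_i) = w_(i mod 2) for 0 <= i <= r-2 and h(v_(r-1)) = w_2."""
    h = {i: i % 2 for i in range(r - 1)}
    h[r - 1] = 2
    return h

def consecutive_images_differ(r):
    """Check h(v_(i-1)) != h(v_i) around the whole cycle, including the wrap-around."""
    h = cycle_map(r)
    return all(h[i] != h[(i + 1) % r] for i in range(r))
```

Since each cycle hyperedge then has its two cycle vertices on distinct vertices of the target hyperedge, its remaining k - 2 vertices can be sent to the k - 2 unused ones, which is where k at least 3 is used.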

From Lemma 4 we conclude that proving that Hypothesis A holds whenever H is connected and 
undirected is equivalent to proving: 

Theorem 5 If $H$ is a connected undirected loopless $k$-uniform hypergraph with at least one edge, then
the $H$-homomorphism problem has a sharp threshold iff (a) $k \ge 3$ or (b) $k = 2$ and $H$ contains a
triangle.

Note that this generalizes the well-known result that $c$-colourability has a sharp threshold for $c \ge 3$ [3],
as can be seen by setting $k = 2$ and taking $H = K_c$.

We do not have a strong feeling as to whether the "connected" condition is necessary here; we
discuss the possibility of extending Theorem 5 to disconnected graphs in Section 3. In Section 3.2, we
provide a disconnected directed graph $H$ for which the $H$-homomorphism problem is a counterexample
to the analogue of Hypothesis A in the relaxed model $\widehat{CSP}_{n,p}(\mathcal{P})$, where a pair of variables may receive
several constraints, but not in the $CSP_{n,p}(\mathcal{P})$ model.

1.3 Tools 

Our main tool is distilled from Friedgut's main theorem in [16]. Friedgut reported to us [private
communication] that his proof can be adapted to the setting of this paper; for completeness, the first
author worked out the details of such an extension in [22] as they did not appear in print. To provide
Friedgut's theorem for CSPs in its full power instead of being restricted to the unsatisfiability property,
we consider, as Friedgut did, properties that are preserved under constraint addition; such properties
are called monotone. A property on CSPs is called monotone symmetric if it is monotone and invariant
under CSP automorphisms. For a property $A$, $A_n$ denotes the restriction of $A$ to CSPs with exactly $n$
variables. Roughly speaking, Friedgut's theorem says that for a value of $p$ that is "within" the coarse
threshold, there is a constant-sized instance $M$ such that $\tau < \Pr[M \subseteq CSP_{n,p}(\mathcal{P})] < 1 - \tau$ for some
constant $\tau$ which does not depend on $n$, and adding $M$ to our random CSP boosts the probability of
being in $A$ by at least $2\alpha > 0$, whereas adding a linear number of new random constraints only boosts
it by at most $\alpha$. First, we must formalize what we mean by "adding $M$". Given two CSPs $M, F$ where
$M$ has $r$ variables, and $F$ has at least $r$ variables, we define $F \oplus M$ to be the CSP obtained by choosing
a random $r$-tuple of variables in $F$ and then adding $M$ on those $r$ variables. Now we can formally state
an adaptation of Friedgut's theorem to the setting of this paper. A proof can be found in [22]:



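Theorem 6 Let $A$ be a monotone symmetric property in $CSP_{n,p}(\mathcal{P})$ which has a coarse threshold.
There exist $p = p(n)$, $\tau, \alpha, \epsilon > 0$, and a CSP $M$ whose constraints are chosen from $\mathrm{supp}(\mathcal{P})$ such that
for an infinite number of $n$:

(a) $\alpha < \Pr[CSP_{n,p}(\mathcal{P}) \in A] < 1 - 2\alpha$.

(b) $\Pr[CSP_{n,(1+\epsilon)p}(\mathcal{P}) \in A] < 1 - 2\alpha$.

(c) $\Pr[CSP_{n,p}(\mathcal{P}) \oplus M \in A] > 1 - \alpha$.

(d) $\tau < \Pr[M \subseteq CSP_{n,p}(\mathcal{P})] < 1 - \tau$.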
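
The operation of adding a fixed CSP M onto a random tuple of variables of F, used in condition (c), can be sketched as follows. The representation (a CSP as a list of (scope, restrictions) pairs) and the name `plant` are assumptions of this illustration, not notation from [16] or [22].

```python
import random

def plant(f_variables, f_constraints, m_variables, m_constraints, rng=None):
    """Return the constraints of F with a copy of M added on a random tuple.

    M's r variables are renamed to r distinct variables of F chosen uniformly
    at random; every constraint of F is kept.
    """
    rng = rng or random.Random()
    targets = rng.sample(list(f_variables), len(m_variables))  # random r-tuple
    rename = dict(zip(m_variables, targets))
    planted = [
        (tuple(rename[x] for x in scope), restrictions)
        for scope, restrictions in m_constraints
    ]
    return list(f_constraints) + planted
```

For example, planting a triangle of "not equal" constraints onto an empty CSP with six variables yields three constraints on three distinct variables.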

Since in our setting $p(n) = c(n)/n^{k-1}$ where $c(n) = \Theta(1)$, Theorem 6(d) implies that $M$ has exactly
one cycle.

Corollary 7 For any $\mathcal{P}$, if $CSP_{n,p}(\mathcal{P})$ has a coarse threshold of satisfiability then there exist $p = p(n)$,
$\alpha, \epsilon > 0$, and a unicyclic CSP $M$ on a constant number of variables whose constraints are chosen from
$\mathrm{supp}(\mathcal{P})$ such that:

(a) $\alpha < \Pr[CSP_{n,p}(\mathcal{P}) \text{ is unsatisfiable}] < 1 - 2\alpha$.

(b) $\Pr[CSP_{n,(1+\epsilon)p}(\mathcal{P}) \text{ is unsatisfiable}] < 1 - 2\alpha$.

(c) $\Pr[CSP_{n,p}(\mathcal{P}) \oplus M \text{ is unsatisfiable}] > 1 - \alpha$.

Our next tool proves some properties for local parts of a random CSP.

Lemma 8 Suppose that $p < cn^{1-k}$ for some positive constant $c$, and let $G$ be an instance of $CSP_{n,p}(\mathcal{P})$.
Let $t$ be a positive constant integer and choose a set $T$ of $t$ random variables. Then for every $\epsilon > 0$ and
integer $r > 0$ there exists an integer $L(c, t, r, \epsilon)$ such that with probability at least $1 - \epsilon$:

(i): No constraint of $G$ contains more than one variable of $T$.

(ii): $G$ induces a forest on the set of the variables that are of distance at most $r$ from $T$.

(iii): There are at most $L$ variables that are of distance at most $r$ from $T$.

Proof. Let $E_1$, $E_2$, and $E_3$ denote the events (i), (ii), and (iii) respectively. Trivially

$\Pr[E_1] = 1 - o(1)$. (1)

The expected number of the cycles of size at most $2r$ which contain at least one variable in $T$ is at most
$t \sum_{i=2}^{2r} n^{ik-i-1} p^i$. Since $p < cn^{1-k}$, each term satisfies $n^{ik-i-1} p^i < c^i/n$. Thus

$\Pr[E_2] \ge 1 - t \sum_{i=2}^{2r} n^{ik-i-1} p^i \ge 1 - \frac{2tr(1+c)^{2r}}{n} = 1 - o(1)$. (2)

The expected number of the variables in a distance of at most $r$ from $T$ is at most $t \sum_{i=1}^{r} n^{i(k-1)} p^i$,
which is bounded by a constant depending only on $c$, $t$ and $r$. So by Chebyshev's inequality, for
sufficiently large $L$:

$\Pr[E_3] \ge 1 - \epsilon/2$. (3)

The lemma follows from (1), (2), and (3). ■

Our third tool is easily proven with a straightforward first moment calculation and concentration
argument (via e.g. the second moment method or Talagrand's inequality); we omit the details.

Lemma 9 Let $T$ be a tree-CSP whose constraints are in $\mathrm{supp}(\mathcal{P})$. There exists $z = o(n^{1-k})$ such that
a.s. $CSP_{n,z}(\mathcal{P})$ contains $T$ as a sub-CSP.



2 Difficulty 

Here we prove Observation 2, thus showing that characterizing those distributions $\mathcal{P}$ for which $CSP_{n,p}(\mathcal{P})$
has a sharp threshold is at least as difficult as determining the location of all such thresholds.

Suppose, for example, that one wanted to know the 3-colourability threshold; i.e. the value $c = c(n)$
such that for all $\epsilon > 0$, $G_{n,p=(c(n)-\epsilon)/n}$ is a.s. 3-colourable and $G_{n,p=(c(n)+\epsilon)/n}$ is a.s. not 3-colourable.
(The existence of this threshold was proven in [3].) We will construct a family of distributions $\mathcal{P}$
such that determining which of those distributions have sharp thresholds is sufficient to determine the
3-colourability threshold.

We set $d = 5$ and $k = 2$, and define two constraints by listing their pairs of forbidden values:

$C_1 = \{(4,4),(5,5)\} \cup (\{1,2,3\} \times \{4,5\}) \cup (\{4,5\} \times \{1,2,3\})$,

$C_2 = \{(1,1),(2,2),(3,3)\} \cup (\{1,2,3\} \times \{4,5\}) \cup (\{4,5\} \times \{1,2,3\})$.

Note that each constraint forces the endpoints of every edge to take values that are either both in
$\{1,2,3\}$ or both in $\{4,5\}$. A $C_1$ constraint says that they have to be different values if they are both in
$\{4,5\}$. A $C_2$ constraint says that they have to be different values if they are both in $\{1,2,3\}$.

We let $C_1$ occur with probability $q$ and $C_2$ occur with probability $1 - q$ in $\mathcal{P}$. Set $c(q) = (1 - q)/q$.
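
The two constraints are small enough to enumerate, which gives a quick sanity check of the remarks above. The names `LOW`, `HIGH`, `CROSS` and `allowed` are introduced only for this sketch.

```python
from itertools import product

LOW, HIGH = {1, 2, 3}, {4, 5}
CROSS = set(product(LOW, HIGH)) | set(product(HIGH, LOW))

C1 = {(4, 4), (5, 5)} | CROSS              # forbidden pairs of C_1
C2 = {(1, 1), (2, 2), (3, 3)} | CROSS      # forbidden pairs of C_2

def allowed(constraint):
    """All pairs of values in {1..5}^2 that the constraint does not forbid."""
    return {pair for pair in product(range(1, 6), repeat=2) if pair not in constraint}

def same_block(pair):
    """True if both values lie in {1,2,3} or both lie in {4,5}."""
    a, b = pair
    return (a in LOW) == (b in LOW)
```

Every allowed pair of either constraint keeps both endpoints in the same block; within {4,5} a C_1 edge behaves like a 2-colouring constraint, and within {1,2,3} a C_2 edge behaves like a 3-colouring constraint.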

Fact 10 (a) If $G_{n,p=c(q)/n}$ is a.s. 3-colourable, then $CSP_{n,p}(\mathcal{P})$ has a sharp threshold.

(b) If there is some $\epsilon > 0$ such that $G_{n,p=(c(q)-\epsilon)/n}$ is a.s. not 3-colourable, then $CSP_{n,p}(\mathcal{P})$ has a
coarse threshold.



Thus, determining the type of the threshold for all such models $CSP_{n,p}(\mathcal{P})$ requires knowing
for which values of $c$ the random graph $G_{n,p=c/n}$ is a.s. 3-colourable, and for which values it is a.s. not
3-colourable.

Proof. Choose our $CSP_{n,p}(\mathcal{P})$ by first taking $G_{n,p=c/n}$ and then setting each edge to be $C_1$ with
probability $q$ and $C_2$ otherwise. Let $G_1, G_2$ be the subgraphs formed by the edges chosen to be $C_1, C_2$
respectively. If $c < 1$ then all components of $G_{n,p=c/n}$ are trees or unicycles and the CSP is trivially
satisfiable. So we can focus on the range $c > 1$ and we let $T$ denote the giant component of $G_{n,p=c/n}$.
Note that the variables of $T$ must either all take values from $\{1, 2, 3\}$ or all take values from $\{4, 5\}$.

Case 1: $c > 1/q$. Then $G_1$ is equivalent to $G_{n,p=c_1/n}$ for $c_1 = cq > 1$ and it follows easily that a.s. $G_1$
contains a giant component which is not 2-colourable. This giant component is a subgraph of $T$ and
so the variables of $T$ must all take values from $\{1,2,3\}$. It follows that the CSP is satisfiable iff $G_2$ is
3-colourable. Note that $G_2$ is equivalent to $G_{n,p=c_2/n}$ for $c_2 = c \times (1 - q) > c(q)$.

Case 2: $c < 1/q$. Then $G_1$ is equivalent to $G_{n,p=c_1/n}$ for $c_1 = cq < 1$ and $G_2$ is equivalent to $G_{n,p=c_2/n}$
for $c_2 = c \times (1 - q) < c(q)$. If $G_2$ is a.s. 3-colourable then the CSP is a.s. satisfiable. If $G_2$ is a.s.
not 3-colourable then the CSP is satisfiable iff $T$ is 2-colourable; i.e., if $G_1$ does not have an odd cycle
lying within $T$. Since $c_1 < 1$, $G_1$ is a random graph with edge-density below the critical point. So
with probability at least some positive constant, it has no odd cycle at all, let alone one lying within
$T$. On the other hand, the distribution of the number of triangles in $G_1$ is asymptotically Poisson with
mean $c_1^3/6$ (see, e.g., Section 3.3 of [26]), and so the probability of containing at least one triangle tends
to $1 - e^{-c_1^3/6}$. If we condition on $u, v, w$ forming a triangle in $G_1$, then the probability that they are
all in the giant component $T$ is easily seen to not decrease, and so is at least $(|T|/n)^3$ which tends to
a positive constant. Therefore, the probability that $G_1$ has an odd cycle in $T$ is at least some positive
constant. This implies that the CSP is neither a.s. satisfiable nor a.s. unsatisfiable.

Fact 10 now follows. If $G_{n,p=c(q)/n}$ is a.s. 3-colourable, then $CSP_{n,p}(\mathcal{P})$ has a sharp threshold which
lies somewhere above $1/q$. If there is some $\epsilon > 0$ such that $G_{n,p=(c(q)-\epsilon)/n}$ is a.s. not 3-colourable, then
for all $1/q - \epsilon/(1-q) < c < 1/q$, we are in Case 2 where $c_2 > c(q) - \epsilon$ and so $G_2$ is a.s. not 3-colourable.
In that range of $c$, $CSP_{n,p=c/n}(\mathcal{P})$ is neither a.s. satisfiable nor a.s. unsatisfiable. So $CSP_{n,p}(\mathcal{P})$ has a
coarse threshold running from $1/q - \delta$ to $1/q$ for some $\delta \ge \epsilon/(1 - q)$. □
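
The bookkeeping of the two cases can be tabulated with a few lines of arithmetic. The helper names `regime` and `triangle_probability` are hypothetical, and the boundary value c = 1/q, which the proof does not treat, is arbitrarily assigned to Case 2 here.

```python
from math import exp

def regime(c, q):
    """Densities of G_1 and G_2 and the case of the proof they fall under."""
    c1, c2 = c * q, c * (1 - q)
    return {
        "case": 1 if c1 > 1 else 2,   # Case 1: c > 1/q, i.e. c1 > 1
        "c1": c1,
        "c2": c2,
        "c_q": (1 - q) / q,           # c(q) from the text
    }

def triangle_probability(c1):
    """Limiting probability that G_{n, c1/n} contains a triangle: 1 - exp(-c1^3 / 6)."""
    return 1 - exp(-c1 ** 3 / 6)
```

In Case 1 the returned `c2` exceeds `c_q`, and in Case 2 it lies below `c_q`, as used in the proof.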

It is straightforward to adapt this example so that, instead of 3-colourability, we use any model
$CSP_{n,p}(\mathcal{P})$ which has a sharp threshold. Suppose that $\mathcal{P}$ is over constraints of size $k$ with domain size
$d$. We will create a distribution $\mathcal{P}'$ over constraints of size $k$ and with domain size $d + 2$. All constraints
will enforce:

(1) All $k$ variables take values in $\{1, \ldots, d\}$ or all $k$ variables take values in $\{d + 1, d + 2\}$.

Constraint $C^*$ also enforces:

(2) The first two variables cannot both be $d + 1$ and they cannot both be $d + 2$.

For each constraint $C_i \in \mathrm{supp}(\mathcal{P})$ we have a constraint $C'_i$ which has the same restrictions as $C_i$, and
also enforces (1). We set $\mathcal{P}'(C'_i) = (1 - q)\mathcal{P}(C_i)$ and we set $\mathcal{P}'(C^*) = q$.

Note that if $\mathcal{P}$ is simply the 3-colouring CSP then this yields the example from above. Similar
reasoning to that above shows that we cannot know which values of $q$ yield a sharp threshold for $\mathcal{P}'$
without knowing the location of the threshold for $\mathcal{P}$. This yields Observation 2.

3 Homomorphisms 

In this section, we prove Theorem 5 which concerns $H$-homomorphisms. Let $G^k_{n,p}$ denote the random
$k$-uniform hypergraph on $n$ vertices where each $k$-tuple is present as a hyperedge with probability $p$.

We begin with a technical lemma: 

Lemma 11 Let $H$ be a connected graph which contains a triangle. Let $u$ be a vertex of $H$ and $M$ be a
unicyclic graph with unique cycle $C$. Denote the vertices of $M$ at distance exactly $r \ge |V(H)| + 3$
from $C$ by $U$. There is a homomorphism from $M$ to $H$ such that all vertices in $U$ are mapped to $u$.

Proof. Let $h$ be a homomorphism from $C$ to the triangle $(v_1, v_2, v_3)$ of $H$. Observe that for $i = 1, 2, 3$,
there exist walks $(v_i =)\, w_{i,0}, \ldots, w_{i,r} \,(= u)$ of length exactly $r$ in $H$. Let $w$ be a vertex in $M$ at
distance $j \le r$ from $C$, and $w'$ be the vertex of $C$ which has distance $j$ from $w$. Extend $h$ by
assigning $h(w) = w_{i,j}$ where $h(w') = v_i$. Observe that $h$ is a partial homomorphism from $M$ to $H$ which
maps every vertex in $U$ to $u$. Trivially $h$ can be extended to a homomorphism from $M$ to $H$. ■
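
The existence of walks of length exactly r, on which the proof relies, can be explored on small examples. The sketch below computes the set of endpoints of all walks of a given exact length; the function name and the two test graphs are choices made here for illustration.

```python
def endpoints_of_walks(adj, source, r):
    """Vertices reachable from source by a walk of exactly r edges.

    adj: dict mapping each vertex to a list of its neighbours.
    """
    current = {source}
    for _ in range(r):
        current = {w for v in current for w in adj[v]}
    return current

# A triangle 0-1-2 with a pendant path 2-3-4: connected and contains a triangle.
with_triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
# A path 0-1-2: connected but bipartite, so walk lengths have fixed parity.
bipartite_path = {0: [1], 1: [0, 2], 2: [1]}
```

In the first graph, walks of length 8 = |V(H)| + 3 from any triangle vertex reach every vertex, whereas in the bipartite path only vertices at even distance are reachable by walks of even length, which shows why the triangle matters.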

Proof of Theorem 5. Let $H$ be some $k$-uniform hypergraph, and assume that the $H$-homomorphism
problem has a coarse threshold. Let $M, p, \alpha, \epsilon$ be as guaranteed by Corollary 7. In this setting, $M$ is
a $k$-uniform unicyclic hypergraph, such that adding $M$ to $G^k_{n,p}$ boosts the probability of not having a
homomorphism to $H$ by at least $2\alpha$.

Our strategy will be to show that adding $M$ to $G^k_{n,p}$ does not increase the probability of not having
a homomorphism to $H$ by more than adding a copy of $G^k_{n,z}$ for some $z(n) = o(n^{1-k})$. We are assuming
that the former boosts that probability to at least $1 - \alpha$ and thus so must the latter. But that will
contradict Corollary 7(b). To show this, we will construct a hypertree $T$ such that the probability that
$G^k_{n,p} \oplus M$ has no homomorphism to $H$ is at most the probability that $G^k_{n,p} \oplus T$ has no homomorphism
to $H$. Then we simply apply Lemma 9 to obtain our desired $z$.

We begin with the case $k \ge 3$.

Consider $G = G^k_{n,p} \oplus M$. Let $M^+$ be the subgraph of $G$ consisting of all hyperedges that contain
at least one vertex of $M$ (and, of course, all vertices in those hyperedges); in other words, $M^+$ is the
subhypergraph induced by the vertices of $M$ and all their neighbours. Lemma 8 implies that there is
some constant $L$ such that, defining $E$ to be the event that "$M^+$ is unicyclic and has at most $L$ vertices,
no hyperedge of $G^k_{n,p}$ contains more than one vertex of $M$, and the vertices of distance at most 2 from
$M^+$ induce a forest", we have $\Pr(E) \ge 1 - \alpha/2$.

Since $M$ is unicyclic and $k \ge 3$, by Lemma 4(a) there exists a homomorphism $h$ from $M$ to a single
edge, say $(v_1, \ldots, v_k)$. Let $h_i$ be the set of the vertices in $M$ that are mapped by $h$ to $v_i$. Obtain the
hypergraph $G'$ from $G$ by (i) removing all edges in $M$; (ii) contracting all of the vertices in $h_i$ into one
single new vertex $u_i$, for each $1 \le i \le k$; (iii) adding the single hyperedge $(u_1, \ldots, u_k)$.

Suppose that $h'$ is a homomorphism from $G'$ to $H$. Then a mapping from the vertices of $G$ to
the vertices of $H$ which maps every vertex $v$ in $G - M$ to $h'(v)$, and every vertex in $h_i$ to $h'(u_i)$, is a
homomorphism from $G$ to $H$. Thus, if $G'$ is homomorphic to $H$ then so is $G$.

We define our hypertree $T$ as follows: $T$ has a hyperedge $(t_1, \ldots, t_k)$, and each $t_i$ lies in $L$ other
hyperedges. Only $t_1, \ldots, t_k$ lie in more than one edge of $T$. Thus, $T$ has $k + k(k-1)L$ vertices and $kL + 1$
hyperedges. If $E$ holds, then the subgraph of $G'$ induced by all edges containing $\{u_1, \ldots, u_k\}$ forms a
tree; note that it is a subtree of $T$. It follows that $G^k_{n,p} \oplus T$ is at least as likely to be non-homomorphic
to $H$ as $G'$ is, so:

$\Pr[G^k_{n,p} \oplus M \text{ is not homomorphic to } H] \le \Pr[G^k_{n,p} \oplus M \text{ is not homomorphic to } H \mid E] + \Pr(\bar{E})$

$\le \Pr[G^k_{n,p} \oplus T \text{ is not homomorphic to } H] + \alpha/2$.

By Lemma 9, there is some $z = o(n^{1-k})$ such that increasing $p$ by an additional $z$ a.s. results in the
addition of a copy of $T$. Thus for every $\epsilon > 0$:

$\Pr[G^k_{n,p} \oplus T \text{ is not homomorphic to } H] \le \Pr[G^k_{n,(1+\epsilon)p} \text{ is not homomorphic to } H]$

which yields a contradiction to Corollary 7(b).

This proves the case where $k \ge 3$, so we now turn to the case $k = 2$. If $H$ contains no triangle, then
the triangle is not homomorphic to $H$. Thus, the triangle forms a unicyclic unsatisfiable CSP using the
$H$-colouring constraints and so we do not have a sharp threshold, since for any $0 < c < 1$: (i) with
probability at least some positive constant, $G_{n,p=c/n}$ is a forest, and hence has a homomorphism to $H$;
and (ii) with probability at least some positive constant, $G_{n,p=c/n}$ contains a triangle and hence has no
homomorphism to $H$. So we will focus on graphs $H$ that contain a triangle. Our proof follows along the same
lines as the case $k \ge 3$, but is complicated a bit since we can no longer assume that $M$ is homomorphic
to a single edge. We only highlight the differences.

Define $M^+$ to be the subgraph of $G = G_{n,p} \oplus M$ induced by all vertices within distance $r =
|V(H)| + |V(M)| + 3$ of the unique cycle of $M$. By Lemma 8 there is some constant $L$ such that with

probability at least 1 — §: is unicyclic and has at most L vertices, no hyperedgc of Gjj p contains 
more than one vertex of M, and the vertices of distance at most 2 from M+ induce a forest. 
Define $U$ to be the set of vertices of $G$ that are at distance exactly $r = |V(H)| + |V(M)| + 3$ from
the unique cycle of $M$. Consider any vertex $u \in H$. By Lemma 11, if $M^+$ is unicyclic then there is a
homomorphism from $M^+$ to $H$ such that all vertices in $U$ are mapped to $u$.

Obtain the graph $G'$ from $G$ by (i) removing all of the vertices of distance less than $r$ from the unique
cycle of $M$, and (ii) contracting $U$ into a single new vertex $u$. Suppose that $h'$ is a homomorphism from
$G'$ to $H$. Then by the previous paragraph, $h'$ can be extended to a homomorphism from $G$ to $H$ where
each vertex $v \in V(G') - u$ is mapped to $h'(v)$, and every vertex in $U$ is mapped to $h'(u)$. Thus, if $G'$
is homomorphic to $H$ then so is $G$.

We now define $T$ to be the tree which consists of a vertex adjacent to $L$ leaves. Since the degree
of $u$ in $G'$ is at most $L$, and using the fact that all vertices of $M$ are deleted when forming $G'$ (this is
where we require $r > |M|$), the rest now follows as in the $k \ge 3$ case. □

3.1 Disconnected Graphs 

In this subsection we discuss the possibilities of extending Theorem 5 to the case where H is discon- 
nected. We will focus on graphs, i.e. the k = 2 case. 

When considering disconnected graphs, it is helpful to note that if the $H'$-homomorphism problem
has a sharp threshold for every component $H'$ of $H$, then the $H$-homomorphism problem has a sharp
threshold. In fact, it is simply the largest of the thresholds for its components. To see this, note first
that for each component $H'$, since the $H'$-homomorphism problem has a sharp threshold, every tree
or unicyclic graph must have a homomorphism to $H'$. $G_{n,p=c/n}$ a.s. has at most one component with
more than one cycle (i.e. a giant component). So a.s. $G_{n,p}$ is homomorphic to $H$ iff either (i) there is
no giant component or (ii) the giant component is homomorphic to $H$. Since the giant component is
connected, it is homomorphic to $H$ iff it is homomorphic to at least one component of $H$. That giant
component will a.s. be homomorphic to at least one component of $H$ iff it is a.s. homomorphic to the
one with the largest satisfiability threshold, since below that threshold it is a.s. homomorphic to that
component and above it it is a.s. homomorphic to none of them.

We now show that if there is at least one graph $H$ such that Hypothesis A does not hold for the
$H$-homomorphism problem, then there must be such a graph with two components: a triangle and a
graph that is triangle-free and not 3-colourable.

Assume that Hypothesis A does not hold for $H$. That is, every unicyclic graph has a homomorphism
to $H$, and the $H$-homomorphism problem has a coarse threshold.

First note that a triangle is not homomorphic to any triangle-free graph. Also, Lemma 4(b) says that
every unicyclic graph is homomorphic to a triangle. So "every unicyclic graph has a homomorphism to
$H$" is equivalent to "$H$ contains a triangle".

Since the $H$-homomorphism problem has a coarse threshold, there is some component $H_1$ of $H$, such
that the $H_1$-homomorphism problem has a coarse threshold. By Theorem 5, $H_1$ has no triangle.

Let $H^1$ be the subgraph of $H$ which consists of all triangle-free components and $H^2$ be the subgraph
consisting of the remaining components of $H$; we have argued that neither $H^1$ nor $H^2$ is empty. Since
each component $H_i$ of $H^2$ has a triangle, Theorem 5 implies that the $H_i$-homomorphism problem has
a sharp threshold. Therefore, the $H^2$-homomorphism problem has a sharp threshold.

Suppose that $H^1$ is 3-colourable. Then every graph homomorphic to $H^1$ is also homomorphic to a
triangle, and hence is homomorphic to $H^2$. It follows that being homomorphic to $H$ is equivalent to
being homomorphic to $H^2$. But this contradicts the facts that $H$-homomorphism has a coarse threshold
and $H^2$-homomorphism has a sharp threshold. Therefore $\chi(H^1) > 3$.

Also, we know that $H^1$-homomorphism has a coarse threshold since $H^1$ is triangle-free.

Let $c_3(n)$ denote the 3-colourability threshold, and let $c'(n)$ be the $H^2$-homomorphism threshold.
Every 3-colourable graph is homomorphic to a triangle and thus is homomorphic to $H^2$. Therefore
$c'(n) \ge c_3(n)$. For every $c < c'(n)$, $G_{n,p=c/n}$ a.s. has a homomorphism to $H^2$ and hence to $H$. Since
$H$-homomorphism has a coarse threshold, there must be some $c = c(n) > c'(n)$ for which $G_{n,p=c/n}$ is
not a.s. non-$H$-homomorphic. With probability bounded away from zero, the non-giant components
of $G_{n,p=c/n}$ are trees and hence are homomorphic to $H^1$ and $H^2$. Therefore a.s. the giant component
of $G_{n,p=c/n}$ is not $H^2$-homomorphic as otherwise $G_{n,p=c/n}$ would not be a.s. non-$H^2$-homomorphic.
Therefore the giant component must not be a.s. non-$H^1$-homomorphic as otherwise $G_{n,p=c/n}$ would be
a.s. non-$H$-homomorphic. Therefore $G_{n,p=c/n}$ is not a.s. non-$H^1$-homomorphic.

Replacing $H^2$ by a triangle has the effect of replacing $c'(n)$ by $c_3(n)$. Since this does not increase
$c'(n)$, we still have some $c^* = c^*(n) > c'(n)$ for which $G_{n,p=c^*/n}$ is not a.s. non-$H^1$-homomorphic. If
$H^1$ is disconnected, add some edges so that it is connected but remains triangle-free; call the resulting
subgraph $(H^1)'$, and call the resulting graph, i.e. $(H^1)'$ plus a triangle component, $H'$. Any graph that
is $H^1$-homomorphic is $(H^1)'$-homomorphic and so $G_{n,p=c^*/n}$ is not a.s. non-$(H^1)'$-homomorphic. Since
$(H^1)'$ is triangle-free, Theorem 5 implies that $(H^1)'$-homomorphism has a coarse threshold. The range of
this threshold either includes $c^*(n)$ or lies higher than $c^*(n)$; either way, it contains a range of values
that is higher than $c_3(n)$. In that range of values of $c$, $G_{n,p=c/n}$ a.s. has no homomorphism to the
triangle component of $H'$. It follows that that range of values lies within the range of a coarse threshold
for $H'$.

Therefore, $H'$ has a triangle but $H'$-homomorphism has a coarse threshold and so violates Hypothesis A.

So the question of whether there is any undirected graph $H$ for which the $H$-homomorphism problem
violates Hypothesis A is equivalent to the following:

Question 12 Is there any triangle-free graph $H_1$ with $\chi(H_1) > 3$ such that for some values of $n$ and
some $c > c_3(n)$, $G_{n,p=c/n}$ is not a.s. non-$H_1$-homomorphic, where $c_3(n)$ is the threshold value of
3-colourability?

3.2 Directed Graphs 

Earlier in this section, we discussed whether there exist any connected graphs H for which the H- 
homomorphism problem violates Hypothesis A. Now we turn our attention to directed graphs. We 
provide an example of a disconnected directed graph H which comes close to violating Hypothesis A. 
It has the properties: 

1. every unicyclic digraph has a homomorphism to H, and 

2. the if-homomorphism problem under the CSPn^p{V) model has a coarse threshold. 

This does not actually violate Hypothesis A. The coarse threshold is under the CSPn,p{V) model 
rather than the CSPn,p{V) model. 

For a directed graph $D$, let $\overline{D}$ denote the undirected graph that is obtained from $D$ by removing
the directions from the edges and then replacing each double edge by a single edge. We define $D_{n,p}$
to be the random digraph on $n$ vertices where each of the $n(n-1)$ potential directed edges is present
with probability $p$. Thus $D_{n,p}$ possibly contains both edges $uv$ and $vu$ for some pair of vertices $v, u$,
i.e. a 2-cycle; in fact, if $p = c/n$ for a constant $c$, then it is straightforward to show that the probability
that $D_{n,p}$ contains at least one 2-cycle is $\zeta + o(1)$ for some constant $\zeta = \zeta(c) < 1$. This is the reason that
we need to use the CSPn,p{V) model rather than the CSPn,p{'P) model; the digraphs formed by the 
GSPn.p(P) model cannot have any 2-cycles since they cannot have more than one constraint on the 
same pair of variables. 

$H$ consists of a specific connected digraph $H_1$, defined below, and a pair of vertices $u_1, u_2$, where
the edges $u_1u_2$ and $u_2u_1$ are both present. There are no edges between $\{u_1, u_2\}$ and $H_1$. $H_1$ has the
following properties:

(i) every unicyclic digraph which does not contain a 2-cycle a.s. has a homomorphism to $H_1$; and

(ii) for some $c_1 > 1/2$, $D_{n,p=c_1/n}$ is not a.s. non-$H_1$-homomorphic.

It is easy to see that any unicyclic digraph, whose cycle is a 2-cycle, has a homomorphism to the
2-cycle. By (i), every other unicyclic digraph has a homomorphism to $H_1$. Thus, every unicyclic digraph
has an $H$-homomorphism, as claimed. We will show that, for every $1/2 < c < c_1$, $D_{n,p=c/n}$ is neither
a.s. $H$-homomorphic nor a.s. non-$H$-homomorphic. Thus, we have a coarse threshold. Condition (ii)
above implies the latter, so we just need to prove the former.

Write $\overline{D}$ for the underlying undirected graph of $D = D_{n,p}$. The graph $\overline{D}$ a.s. has a giant
component, since it is equivalent to $G_{n,p'}$ where $p' = 1 - (1-p)^2 = \frac{2c}{n} - o(\frac{1}{n})$ and $2c > 1$.
The number of 2-cycles in $D$ is easily shown to have a
Poisson distribution with constant mean, using the same analysis as in, e.g., Section 3.3 of [26]. Also,
each edge of $\overline{D}$ is equally likely to correspond to a 2-cycle in $D$. Therefore, with probability bounded
away from zero, $D$ has a 2-cycle which lies in the giant component of $\overline{D}$. It is not hard to see that
a.s. if $D$ has such a 2-cycle then there is no $H$-homomorphism: that 2-cycle must be mapped onto
$u_1, u_2$. Since $H$ has no edges between $H_1$ and $\{u_1, u_2\}$, any vertex that can be reached in $\overline{D}$ from
that 2-cycle must also be mapped onto $u_1$ or $u_2$. So the entire giant component must be mapped onto
$\{u_1, u_2\}$. It is well-known that a.s. a giant component is not 2-colourable and hence has an odd cycle.
(This follows, e.g., from the facts that: (i) $\overline{D}$ is a.s. not 2-colourable and (ii) with probability at least
some positive constant, all components of $\overline{D}$ other than the giant one are trees and hence 2-colourable.)
Thus a.s. the giant component of $\overline{D}$ has an odd cycle and no odd cycle can be mapped onto a 2-cycle.
Therefore, $D$ is not a.s. $H$-homomorphic.

It remains only to prove the existence of some $H_1$ satisfying (i), (ii). We will choose $H_1$ to be a
tournament (i.e. for every pair of vertices, exactly one of the possible edges between them is present)
which contains every loopless 2-cycle-free directed graph on $k_0$ vertices as a subgraph, where $k_0$ is a

constant defined below. This can be done trivially if \Hi\ > ko ; simply place each tournament on 
ko vertices on a different set of vertices of Hi. 

For an undirected graph $G$ which does not contain any multiple edges, the oriented chromatic
number $\chi_o(G)$ of $G$ is the minimum number $k$ such that every loopless 2-cycle-free directed graph $D$
whose underlying undirected graph $\overline{D}$ equals $G$ is homomorphic to a loopless 2-cycle-free directed
graph $H$ with at most $k$ vertices.
The acyclic chromatic number of a graph $G$ is the least integer $k$ for which there is a proper colouring of
the vertices of $G$ with $k$ colours in such a way that every cycle of $G$ contains at least 3 different colours.
It was proved in [33] that if the acyclic chromatic number of a graph $G$ is at most $k$, then its oriented
chromatic number is at most $k \times 2^{k-1}$. We also need the following:

Lemma 13 There exists a number $c > 1$ such that a.s. the acyclic chromatic number of $G_{n,p=c/n}$ is at most 5.

Proof. Let $G = G_{n,p}$. A pendant path in $G$ is a path in which no vertices other than the endpoints
lie in any edge of the graph off the path. It was proven in [29] (see the proof of Lemma 6) that there 
exists c > 1 such that a.s. after removing the internal vertices of pendant paths of length at least 4 
from G every component is either a tree or it is unicyclic. One can use 3 colors to color the vertices in 
these components and then use 2 other colors to color the removed vertices such that every cycle in G 
is colored by at least 3 colors. ■ 
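
The definition of an acyclic colouring can be tested directly on small graphs. The sketch below relies on the standard reformulation that, in a proper colouring, every cycle contains at least 3 colours exactly when every pair of colour classes induces a forest. The function names and the edge-list representation are illustrative assumptions, not notation from the paper.

```python
def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def is_acyclic_colouring(n, edges, colour):
    """Check that `colour` is a proper colouring of the simple graph on
    vertices 0..n-1 in which every cycle sees at least 3 colours."""
    if any(colour[u] == colour[v] for u, v in edges):
        return False  # not a proper colouring
    colours = sorted(set(colour))
    for i, a in enumerate(colours):
        for b in colours[i + 1:]:
            # the edges between classes a and b must form a forest
            parent = list(range(n))
            for u, v in edges:
                if {colour[u], colour[v]} == {a, b}:
                    ru, rv = _find(parent, u), _find(parent, v)
                    if ru == rv:
                        return False  # found a 2-coloured cycle
                    parent[ru] = rv
    return True


def oriented_bound(k):
    """Bound k * 2^(k-1) on the oriented chromatic number, as quoted from [33]."""
    return k * 2 ** (k - 1)


four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
assert not is_acyclic_colouring(4, four_cycle, [0, 1, 0, 1])
assert is_acyclic_colouring(4, four_cycle, [0, 1, 0, 2])
assert oriented_bound(5) == 80
```

With acyclic chromatic number at most 5, the quoted bound evaluates to $5 \times 2^4 = 80$.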

When $D = D_{n,p=c_1/n}$, every edge is present in the underlying undirected graph $\overline{D}$ with probability
$2p - p^2$, independently of the other edges. Thus, if $c_1 = c/2$, where $c$ is the constant obtained from Lemma 13, a.s. the acyclic
chromatic number of $\overline{D}$ is at most 5 and so $\chi_o(\overline{D}) \le 5 \times 2^4 = k_0$. Therefore, a.s. if $D$ is 2-cycle-free
then $D$ is homomorphic to some loopless 2-cycle-free digraph $H$ on at most $k_0$ vertices. Since every
such $H$ is a subgraph of $H_1$, this would mean $D$ is homomorphic to $H_1$. Since $D$ does not a.s. have a
2-cycle, this establishes that $H_1$ satisfies (ii).
4 The (d, k, t)-model

In this section, we prove Theorem 3 which says that the $(d, k, t)$-model has a sharp threshold whenever
$d, k \ge 2$ and $1 \le t < d^{k-1}$; i.e. whenever $d, k, t$ are not such that the model is a.s. unsatisfiable for all
$c > 0$.

Proof of Theorem 3 Suppose that the $(d, k, t)$-model exhibits a coarse threshold. Then consider
$p$, $\alpha$, $\epsilon$ and $M$ as guaranteed by Corollary 7. As in the proof of Theorem 5, we will find some hypertree
$T$ such that adding $T$ to the random CSP increases the probability of unsatisfiability by a constant.
This time, adding $T$ may not boost the probability as much as $M$ does; but it will boost it by a small
constant, and that will be enough.

Gent et al. [19] proved that for $k = 2$: if $t < d^{k-1}$ and $M$ is unicyclic, then $M$ is satisfiable (see their
Theorem 1 and Corollaries 1 and 2). Their argument easily extends to the case where $k \ge 3$. So we
can assume that $M$ is satisfiable. Suppose that $V(M) = \{u_1, \dots, u_r\}$ and let $a_i$ be the value of $u_i$ in some
particular satisfying assignment $A$ of $M$. Given a CSP $F$ on at least $r$ variables, we define $F \oplus A$ to be
the CSP formed by choosing a random ordered $r$-tuple of variables $v_1, \dots, v_r$ in $F$ and for each $1 \le i \le r$,
forcing $v_i$ to take the value $a_i$ by adding a one-variable constraint on $v_i$. Clearly the probability that
$CSP_{n,p}(\mathcal{P}) \oplus A$ is unsatisfiable is at least as high as the probability that $CSP_{n,p}(\mathcal{P}) \oplus M$ is unsatisfiable.

Lemma 8 implies that there exists some constant L such that, defining Ei to be the event that 
"every Vi has at most L neighbours and no hyperedge contains two Vi,Vj" , Pr{Ei) > 1 — f . Suppose 
that El holds. 

Consider a particular $v_i$ and expose the $\ell \le L$ edges containing it, $e_1, \dots, e_\ell$, and the corresponding
constraints $C_1, \dots, C_\ell$. For each $C_j$, let $C'_j$ be the $(k-1)$-variable constraint obtained by restricting $v_i$ to
be $a_i$; i.e. a $(k-1)$-tuple of values is permitted for $C'_j$ iff $C_j$ permits that same $(k-1)$-tuple along with
$v_i = a_i$. Thus, we can remove $C_j$ and add the constraint $C'_j$ on the $k-1$ other variables. After doing
so for every $C_j$, we can remove the restriction that $v_i = a_i$, since $v_i$ no longer lies in any constraints.
(Note that, since $E_1$ holds, no constraint will contain some pair $v_i, v_j$ and thus be reduced twice.)
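
The reduction of a $k$-variable constraint to a $(k-1)$-variable constraint by fixing one variable can be illustrated as follows. This is a minimal sketch under the assumption that a constraint is stored as its set of restrictions, i.e. forbidden $k$-tuples of values; the function name `restrict` is hypothetical.

```python
def restrict(restrictions, i, a):
    """Return the restrictions of the (k-1)-variable constraint obtained by
    fixing the variable in position i to the value a.

    `restrictions` is a set of forbidden k-tuples of values. A (k-1)-tuple is
    forbidden afterwards iff it was forbidden together with value a in position i.
    """
    return {tup[:i] + tup[i + 1:] for tup in restrictions if tup[i] == a}


# d = 3, k = 3, t = 3: three forbidden triples of values
C = {(1, 2, 3), (1, 1, 1), (2, 2, 3)}
C_reduced = restrict(C, 0, 1)
assert C_reduced == {(2, 3), (1, 1)}
# the reduced constraint never has more restrictions than the original
assert len(C_reduced) <= len(C)
```

A constraint with $t$ restrictions therefore reduces to one with at most $t$ restrictions, and possibly fewer.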

It is useful now to consider choosing $CSP_{n,p}(\mathcal{P}) \oplus A$ by first selecting the random variables $v_1, \dots, v_r$
and then choosing $CSP_{n,p}(\mathcal{P})$. Thus, carrying out the operation described in the previous paragraph
is equivalent to, for each $v_i$: expose $\ell$, choose $\ell$ random $(k-1)$-tuples of variables from $V - \{v_1, \dots, v_r\}$;
for each selected $(k-1)$-tuple, choose a random $C_j$ with exactly $t$ restrictions and place the restricted
constraint $C'_j$ on the $(k-1)$-tuple; then remove $v_i$. Note that since $C_j$ has $t$ restrictions, $C'_j$ has at most $t$ restrictions.

For convenience, we modify the experiment in a manner that increases the probability of unsatisfiability.
For each $v_i$, instead of randomly exposing $\ell$, we simply assume $\ell = L$; i.e. we choose $L$ random
$(k-1)$-tuples. Next, if a restricted constraint $C'_j$ has fewer than $t$ restrictions, we add more restrictions
to it so that it has exactly $t$ restrictions. Our final modification is that instead of picking a random
$(k-1)$-tuple and randomly selecting $C_j$ as described above, we choose $\binom{d^{k-1}}{t}$ random $(k-1)$-tuples and
place each of the $\binom{d^{k-1}}{t}$ possible constraints on $k-1$ variables and with $t$ restrictions on one of the $(k-1)$-tuples.
The last modification may appear to be a bit of an overkill, but it has the (minor) convenience that
the added constraints are not randomly selected. Finally, we wish to do without the fact that $v_1, \dots, v_r$
are not permitted to be selected as members of the $(k-1)$-tuples. So we choose the $(k-1)$-tuples
randomly from amongst all vertices and let $E_2$ be the event that none of them use any of the vertices
in $v_1, \dots, v_r$. Since $r = O(1)$ and we are choosing a total of $O(1)$ $(k-1)$-tuples, $\Pr(E_2) = 1 - O(n^{-1})$.

So we let $G$ be a random CSP formed as follows: start with a random $CSP_{n,p}(\mathcal{P})$ and then for
each of the $\binom{d^{k-1}}{t}$ possible constraints on $k-1$ variables and with $t$ restrictions, choose $rL$ random
ordered $(k-1)$-tuples of variables and place that constraint on them. $\Pr(G \text{ is unsatisfiable} \mid E_2) \ge
\Pr(CSP_{n,p}(\mathcal{P}) \oplus A \text{ is unsatisfiable})$, as described in the preceding paragraph. Since $\Pr(E_2) = 1 - o(1)$,
this implies that adding those $\binom{d^{k-1}}{t} rL$ constraints boosts the probability of unsatisfiability by at least
$2\alpha - o(1)$. We say that a collection of $\binom{d^{k-1}}{t} rL$ $(k-1)$-tuples is bad if adding the constraints to
that set results in an unsatisfiable CSP. So, consider the following random experiment: pick a random
$CSP_{n,p}(\mathcal{P})$ and then pick $\binom{d^{k-1}}{t} rL$ ordered $(k-1)$-tuples of the variables. The probability that we pick
a bad collection is at least $2\alpha$. Since $\binom{d^{k-1}}{t} rL = O(1)$, a simple first moment calculation shows that
a.s. the choice of $(k-1)$-tuples will be vertex disjoint. Thus, the probability of picking a bad collection
is at least $2\alpha - o(1)$ even if we condition on the $(k-1)$-tuples being vertex-disjoint.

Now we are finally ready to define our hypertree $T$ as follows: (i) take the hypergraph consisting of
a vertex $v$ lying in $rL\binom{d^k}{t}$ edges where no other vertex lies in more than one of the edges (i.e. a star),
and (ii) place each of the $\binom{d^k}{t}$ possible $(d, k, t)$-constraints on $rL$ of the edges. This, of course, is where
we take advantage of the fact that we are using the $(d, k, t)$-model.

Now consider adding a copy of $T$ to $CSP_{n,p}(\mathcal{P})$. For each $1 \le \delta \le d$, let $T_\delta$ denote the collection of
$(k-1)$-tuples obtained by removing $v$ from every edge of $T$ that contains a constraint in which every
restriction has $v = \delta$; note that $T_\delta$ consists of $rL$ copies of every constraint on $k-1$ variables with
exactly $t$ restrictions and so $|T_\delta| = \binom{d^{k-1}}{t} rL$. The probability that for each $1 \le \delta \le d$, $T_\delta$ is a bad set is
at least $(2\alpha - o(1))^d$. Note that if every $T_\delta$ is a bad set, then the resulting CSP is unsatisfiable because
setting $v = \delta$ requires the set of $(k-1)$-constraints on $T_\delta$ to be enforced.

So by Lemma 9, there is some $z = o(n^{1-k})$ such that the probability that $CSP_{n,p+z}(\mathcal{P})$ is
unsatisfiable is at least $(2\alpha - o(1))^d$. By considering adding $x$ copies of $T$, we see that the probability
that $CSP_{n,p+xz}(\mathcal{P})$ is satisfiable is at most $(1 - (2\alpha - o(1))^d)^x$, which is less than $\alpha$ for some sufficiently
large constant $x$. Since $z = o(n^{1-k})$, this implies that $\Pr(CSP_{n,(1+\epsilon)p}(\mathcal{P}) \text{ is unsatisfiable}) > 1 - \alpha$, which
contradicts Corollary 7(b). □
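
The final step only needs some constant $x$ with $(1 - (2\alpha)^d)^x < \alpha$. The sketch below computes the smallest such $x$ for concrete values, ignoring the $o(1)$ terms; the function name `copies_needed` and the sample values of $\alpha$ and $d$ are illustrative assumptions.

```python
def copies_needed(alpha: float, d: int) -> int:
    """Smallest x with (1 - (2*alpha)**d)**x < alpha, dropping the o(1) terms.

    alpha is the boost from Corollary 7 and d is the domain size.
    """
    if not (0 < alpha <= 0.5) or d < 1:
        raise ValueError("need 0 < alpha <= 1/2 and d >= 1")
    q = (2 * alpha) ** d  # chance that a single copy of T forces unsatisfiability
    x, survive = 0, 1.0
    while survive >= alpha:
        survive *= 1 - q  # probability that x copies all fail to do so
        x += 1
    return x


x = copies_needed(alpha=0.1, d=2)
assert (1 - 0.2 ** 2) ** x < 0.1 <= (1 - 0.2 ** 2) ** (x - 1)
```

Since $x$ depends only on $\alpha$ and $d$, it is a constant, so $xz$ is still $o(n^{1-k})$.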



5 Binary CSP's with domain size 3 

Recall that the arguments from Istrate [25] and from Creignou and Daude [10] can show that when the
domain size $d = 2$, then Hypothesis A holds; i.e. if every unicyclic CSP is satisfiable, then $CSP_{n,p}(\mathcal{P})$
has a sharp threshold. This result does not extend to $d = 3$. Consider the following example, with
$d = 3, k = 2$:

Example 14 We have two constraints. $C_1$ says that either both variables are equal to 1, or neither is
equal to 1. $C_2$ says that the variables cannot both have the same value. $\mathcal{P}(C_1) = \frac{2}{3}$, $\mathcal{P}(C_2) = \frac{1}{3}$.

Observe that every unicyclic CSP that uses only constraints $C_1, C_2$ is satisfiable.

Consider any $\frac{3}{2} < c < 3$. Thus, a.s. the sub-CSP formed by the $C_1$ constraints has a giant
component, and the sub-CSP formed by the $C_2$ constraints does not. We will show that $CSP_{n,p}(\mathcal{P})$ is
neither a.s. satisfiable nor a.s. unsatisfiable.

To see that it is not a.s. unsatisfiable, note that the subgraph induced by the $C_2$ constraints is
2-colourable with probability at least some positive constant. This follows from the well known fact [13]
that for $c < 1$ the random graph $G(n, \frac{c}{n})$ is a forest with probability at least some positive constant. If
it is 2-colourable, then we can satisfy all the $C_2$ constraints by assigning every variable either 2 or 3;
this will not violate any $C_1$ constraints.

To see that it is not a.s. satisfiable, note that the subgraph formed by the $C_1$ constraints has a giant
component $T$. So either every variable in $T$ is assigned 1 or none of them are. A.s. at least one $C_2$
constraint has both variables in $T$, and so they cannot both be assigned 1. Thus, a.s. no variables in $T$
can be assigned 1. This implies that if the $C_2$ constraints form an odd cycle using variables of $T$ then
the CSP is not satisfiable. That event occurs with probability at least some positive constant, because
the graph formed by the $C_2$ constraints that use only variables of $T$ is $G_{n',p/3}$ where $n' = |T| = \Theta(n)$
and so it is not 2-colourable with probability at least some positive constant.
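
The deterministic facts used about Example 14 can be confirmed by brute force on small instances. The sketch below encodes $C_1$ and $C_2$ as predicates on a pair of values; the helper names `solutions` and `cycle` are hypothetical, and only cycles of length 3 to 5 are checked, so this is a sanity check rather than a proof.

```python
from itertools import product

VALUES = (1, 2, 3)


def c1(a, b):
    """C_1: both variables equal 1, or neither does."""
    return (a == 1) == (b == 1)


def c2(a, b):
    """C_2: the two variables take different values."""
    return a != b


def solutions(n, constraints):
    """All assignments to variables 0..n-1 satisfying every (i, j, predicate)."""
    return [s for s in product(VALUES, repeat=n)
            if all(pred(s[i], s[j]) for i, j, pred in constraints)]


def cycle(kinds):
    """A cycle whose i-th edge carries the constraint kinds[i]."""
    n = len(kinds)
    return [(i, (i + 1) % n, kind) for i, kind in enumerate(kinds)]


# A triangle of C_2 constraints is satisfiable, but only by using the value 1.
triangle_solutions = solutions(3, cycle((c2, c2, c2)))
assert triangle_solutions and all(1 in s for s in triangle_solutions)

# Under C_1, assigning 1 to one variable forces 1 on the other.
assert all(b == 1 for a, b in product(VALUES, repeat=2) if c1(a, b) and a == 1)

# Every cycle of length 3..5 with any mix of C_1 and C_2 is satisfiable.
for n in range(3, 6):
    for kinds in product((c1, c2), repeat=n):
        assert solutions(n, cycle(kinds))
```

The first two assertions are exactly the ingredients of the argument: a $C_2$ triangle forces the value 1 somewhere, and $C_1$ propagates the value 1 along the giant component.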

It is instructive to look at this example in light of Corollary 7. Here, the subgraph $M$ is a triangle
whose edges are all $C_2$ constraints. If $M$ appears, then at least one of its variables must be assigned
the value 1. However, with probability at least some $\zeta > 0$, the remainder of the CSP has a structure
implying that if at least one of those variables has the value 1 then some specific set of $\Theta(n)$ other
variables all must be assigned the value 1. But a.s. there is a $C_2$ constraint between at least two of
those variables and thus they can't all be assigned 1. This enables the appearance of $M$ to boost the
probability of unsatisfiability by at least some positive constant.

The main result of this section shows that when $d = 3$ and $k = 2$, if Hypothesis A fails on some model,
then there must be an $M$ as in Corollary 7 whose presence boosts the probability of unsatisfiability for
essentially the same reason as that in Example 14.

Before presenting our theorem, we begin with a few preliminaries. To simplify, we will take k = 2. 

Consider a CSP where every constraint is on 2 variables. Suppose there is some constraint on
variables $v, u$ which implies that if $v$ is assigned $\delta$ then $u$ must be assigned $\gamma$; we say that $v : \delta \to u : \gamma$.
Moreover if there is a sequence of variables $v_1, \dots, v_r$ and values $\delta_1, \dots, \delta_r$ such that $v_i : \delta_i \to v_{i+1} : \delta_{i+1}$
for $i = 1, \dots, r-1$ then we say that $v_1 : \delta_1 \rightsquigarrow v_r : \delta_r$.

For each variable $v$ and each pair of (possibly equal) values $\delta, \gamma$ we define $F_{\delta,\gamma}(v)$ to be the set of
variables $u$ such that $v : \delta \rightsquigarrow u : \gamma$, and we define $F_\delta(v) = \bigcup_{1 \le \gamma \le 3} F_{\delta,\gamma}(v)$. Thus $F_\delta(v)$ is the set of
variables $u$ such that if $v$ is assigned $\delta$ then there is a path of constraints which imply that $u$ must
be assigned a particular value. Assigning $\delta$ to $v$ may force assignments to other variables $w$ via a
combination of more than one path of constraints. But the locally tree-like nature of $CSP_{n,p}(\mathcal{P})$ will
imply that such variables are not a significant concern.

In $CSP_{n,p}(\mathcal{P})$, we can expose $F_\delta(v)$ by using a simple breadth-first search from $v$. This allows us
to analyze the distribution of the size of $F_\delta(v)$ and $F_{\delta,\gamma}(v)$ using a standard branching-process analysis
(see e.g. Chapter 5 of [26]). Straightforward branching-process arguments yield:
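
The breadth-first exposure of the forced sets can be sketched on a fixed binary CSP. The representation below (each constraint given as a pair of variables plus its set of allowed value pairs) and the function name `forced_sets` are illustrative assumptions; the sketch also adopts the convention that the starting variable itself counts as forced, which corresponds to a sequence of length $r = 1$.

```python
from collections import defaultdict, deque


def forced_sets(constraints, v, delta):
    """Breadth-first search over (variable, value) states.

    `constraints` is an iterable of (x, y, allowed), where `allowed` is the
    set of permitted value pairs (value of x, value of y).
    Returns a dict mapping each value gamma to the set of variables u that are
    forced to gamma by a chain of single-constraint implications from v = delta.
    """
    adj = defaultdict(list)
    for x, y, allowed in constraints:
        adj[x].append((y, allowed))
        adj[y].append((x, {(b, a) for a, b in allowed}))  # reversed direction
    start = (v, delta)
    seen = {start}
    queue = deque([start])
    while queue:
        x, a = queue.popleft()
        for y, allowed in adj[x]:
            options = {b for a2, b in allowed if a2 == a}
            if len(options) == 1:  # x = a leaves y a single permitted value
                state = (y, next(iter(options)))
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    result = defaultdict(set)
    for u, gamma in seen:
        result[gamma].add(u)
    return dict(result)


# The constraints of Example 14 on a path 0 - 1 - 2 (C_1) plus an edge 2 - 3 (C_2)
VALUES = (1, 2, 3)
C1 = {(a, b) for a in VALUES for b in VALUES if (a == 1) == (b == 1)}
C2 = {(a, b) for a in VALUES for b in VALUES if a != b}
csp = [(0, 1, C1), (1, 2, C1), (2, 3, C2)]
assert forced_sets(csp, 0, 1) == {1: {0, 1, 2}}
assert forced_sets(csp, 0, 2) == {2: {0}}
```

In this example the value 1 propagates along the $C_1$ constraints, while the values 2 and 3 force nothing beyond the starting variable.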

Lemma 15 With the exception of at most $d^2$ constants $c$, if $p = (c + o(1))/n$ then for every pair $\delta, \gamma$,
one of these cases holds:

(a) $\mathrm{Exp}(|F_{\delta,\gamma}(v)|) = O(1)$; or
(b) $\mathrm{Exp}(|F_{\delta,\gamma}(v)|) = \Theta(n)$.

We omit the standard proofs. We remark only that those constants are the so-called critical
points for each $F_{\delta,\gamma}$.

We say that $F_{\delta,\gamma}$ is subcritical if case (a) holds and supercritical if case (b) holds. We say that $F_\delta$
is supercritical if $F_{\delta,\gamma}$ is supercritical for at least one $\gamma$, and is subcritical otherwise. Markov's Inequality
immediately implies:

Lemma 16 (a) If $F_{\delta,\gamma}$ is subcritical, then for every $\zeta > 0$ there is a constant $L$ such that
$\Pr(|F_{\delta,\gamma}(v)| < L) > 1 - \zeta$.

(b) If $F_{\delta,\gamma}$ is supercritical, then there are constants $\zeta, \beta > 0$ such that $\Pr(|F_{\delta,\gamma}(v)| > \beta n) > \zeta$.

We can now state the main result of this section: 

Theorem 17 Consider some $\mathcal{P}$ with $d = 3, k = 2$ such that $CSP_{n,p}(\mathcal{P})$ has a coarse threshold and
every unicyclic CSP formed from $\mathrm{supp}(\mathcal{P})$ is satisfiable. Then there exist $p, \alpha, \epsilon, M$ as in Corollary 7
such that for some value $1 \le \delta \le 3$ we have:

(a) $M$ cannot be satisfied using only the two values other than $\delta$; and

(b) $F_{\delta,\delta}$ is supercritical.

E.g., in Example 14 we can take $M$ to be a triangle of $C_2$ constraints and $\delta = 1$.

This theorem does not extend to $d = 4, k = 2$ nor $d = 3, k = 3$; in each of these cases, we have
counterexamples. Before proving our theorem, we start with a helpful lemma.

Lemma 18 Suppose $p = (c + o(1))/n$ where $c$ is not one of the 9 exceptional constants from Lemma
15. If $F_\delta$ is supercritical then either (i) $F_{\delta,\delta}$ is supercritical or (ii) there is some $\mu$, such that $F_{\mu,\mu}$ is
supercritical and there is a sequence of constraints in $\mathrm{supp}(\mathcal{P})$ through which $v : \delta \rightsquigarrow u : \mu$.

Proof: If $F_{\delta,\delta}$ is supercritical then (i) holds; so assume otherwise. Thus there is some $\gamma \ne \delta$ such that
$F_{\delta,\gamma}$ is supercritical. Thus there is a sequence of constraints in $\mathrm{supp}(\mathcal{P})$ through which $v : \delta \rightsquigarrow u : \gamma$.
So if $F_{\gamma,\gamma}$ is supercritical then (ii) holds with $\mu = \gamma$; so assume otherwise.

Let $\alpha$ be the third value. Consider any vertex $v$ and any constant $\zeta > 0$.

Case 1: there is no sequence of constraints in $\mathrm{supp}(\mathcal{P})$ through which $v : \delta \rightsquigarrow u : \alpha$. Thus
$F_{\delta,\alpha}(v) = \emptyset$. By Lemma 16(a) there is some constant $L_1$ such that with probability at least $1 - \zeta/3$,
$|F_{\delta,\delta}(v)| < L_1$. If that bound holds, then by Lemma 8, there is a constant $L_2$ such that with probability
at least $1 - \zeta/3$, the set of variables $u$ such that there is some $w \in F_{\delta,\delta}(v)$ and some constraint implying
$w : \delta \to u : \gamma$ has size at most $L_2$; call the set of such variables $X$. If that second bound holds,
then applying Lemma 16(a) again, there is a constant $L_3$ such that with probability at least $1 - \zeta/3$,
$|\bigcup_{x \in X} F_{\gamma,\gamma}(x)| < L_3$. Therefore, with probability at least $1 - \zeta$, $|F_{\delta,\gamma}(v)| < L_3$; since this holds for
every $\zeta > 0$, this contradicts Lemma 16(b) and the fact that $F_{\delta,\gamma}$ is supercritical.

Case 2: there is a sequence of constraints in $\mathrm{supp}(\mathcal{P})$ through which $v : \delta \rightsquigarrow u : \alpha$. Then the same
argument that showed $F_{\gamma,\gamma}$ is subcritical shows that $F_{\alpha,\alpha}$ is also subcritical. We proceed as in Case 1:

By Lemma 16(a) there is some constant $L_1$ such that with probability at least $1 - \zeta/5$, $|F_{\delta,\delta}(v)| < L_1$. If
that bound holds, then by Lemma 8, there is a constant $L_2$ such that with probability at least $1 - \zeta/5$,
the set of variables $u$ such that there is some $w \in F_{\delta,\delta}(v)$ and some constraint implying $w : \delta \to u : \alpha$
has size at most $L_2$; call the set of such variables $X_1$. If that second bound holds, then applying Lemma
16(a) again, there is a constant $L_3$ such that with probability at least $1 - \zeta/5$, $|\bigcup_{x \in X_1} F_{\alpha,\alpha}(x)| < L_3$.
If that bound holds, then again by Lemma 8, there is a constant $L_4$ such that with probability at
least $1 - \zeta/5$, the set of variables $u$ such that either (i) there is some $w \in F_{\delta,\delta}(v)$ and some constraint
implying $w : \delta \to u : \gamma$ or (ii) there is some $w \in \bigcup_{x \in X_1} F_{\alpha,\alpha}(x)$ and some constraint implying
$w : \alpha \to u : \gamma$ has size at most $L_4$; call the set of such variables $X_2$. Finally, if all those bounds hold,
then applying Lemma 16(a) again, there is a constant $L_5$ such that with probability at least $1 - \zeta/5$,
$|\bigcup_{x \in X_2} F_{\gamma,\gamma}(x)| < L_5$. Therefore, with probability at least $1 - \zeta$, $|F_{\delta,\gamma}(v)| < L_5$; since this holds for
every $\zeta > 0$, this contradicts Lemma 16(b) and the fact that $F_{\delta,\gamma}$ is supercritical. □

We are now ready to prove our theorem. 

Proof of Theorem 17: Suppose that $CSP_{n,p}(\mathcal{P})$ has a coarse threshold and consider $M, \epsilon, \alpha, p =
p(n)$ from Corollary 7. If $p = (c + o(1))/n$ where $c$ is one of the 9 exceptional constants from Lemma 15,
then we can increase $p$ by some small $\epsilon'/n$, and decrease $\epsilon$ slightly, so that the conditions of Corollary 7
still hold. This allows us to apply Lemmas 15, 18.

As defined in [31], a value $1 \le \delta \le 3$ is bad if there is a constraint in $\mathrm{supp}(\mathcal{P})$ which forbids a
variable from receiving $\delta$; i.e. if there is a constraint $C$ which contains the restrictions $(\delta, 1), (\delta, 2), (\delta, 3)$
or the restrictions $(1, \delta), (2, \delta), (3, \delta)$. A value $\delta$ is also said to be bad if there is a sequence of constraints
in $\mathrm{supp}(\mathcal{P})$ joining variables $u, v$ for which $v : \delta \rightsquigarrow u : \gamma$ where $\gamma$ is a bad value. It is easy to see
that if there is a unicyclic CSP $M$ formed from the constraints of $\mathrm{supp}(\mathcal{P})$ such that every satisfying
assignment to $M$ uses at least one bad value, then $M$ can be modified to an unsatisfiable unicyclic CSP
$M'$ formed from $\mathrm{supp}(\mathcal{P})$: we simply attach paths of constraints to each variable of $M$ which forbid
those variables from receiving any bad values. (See [31] for the details.) Therefore, if every unicyclic
CSP formed from $\mathrm{supp}(\mathcal{P})$ is satisfiable, then every such CSP can be satisfied without using any bad
values. Thus, $M$ can be satisfied without using any bad values.

Case 1: There is a value $\delta$ such that (i) every satisfying assignment of $M$ must use $\delta$ or a bad value
on at least one variable and (ii) $F_\delta$ is supercritical.

If there are any bad values, then the above construction produces a unicyclic $M' \supseteq M$ such that
every satisfying assignment of $M'$ must use $\delta$ on at least one variable. (Otherwise, set $M' = M$.) If
$F_{\delta,\delta}$ is supercritical then $M', \delta$ satisfy Theorem 17. Otherwise, by Lemma 18, there is a value $\mu \ne \delta$
such that $F_{\mu,\mu}$ is supercritical and there is a sequence of constraints so that $v : \delta \rightsquigarrow u : \mu$. Attaching
that sequence to every variable of $M'$ yields a unicyclic CSP $M''$ for which every satisfying assignment
must use $\mu$ on at least one variable. Thus $M'', \mu$ satisfy Theorem 17.

Case 2: There is a satisfying assignment $A$ of $M$ in which every value $\delta$ used is such that $\delta$ is
not bad and $F_\delta$ is not supercritical. Suppose that $M$ has $r$ variables $x_1, \dots, x_r$ and that $A$ assigns $a_i$
to $x_i$. Recall from Section 4 that $CSP_{n,p}(\mathcal{P}) \oplus A$ is formed by taking $CSP_{n,p}(\mathcal{P})$ and then choosing
$r$ random variables $v_1, \dots, v_r$ and adding one-variable constraints that force $v_i$ to take $a_i$. Clearly
$\Pr(CSP_{n,p}(\mathcal{P}) \oplus A \text{ is unsatisfiable}) \ge \Pr(CSP_{n,p}(\mathcal{P}) \oplus M \text{ is unsatisfiable})$.

Expose $F = \bigcup_{i=1}^{r} F_{a_i}(v_i)$, and $U$, the set of variables outside of $F$ that lie in a constraint with a
variable in $F$. Since all of the $F_{a_i}$ are subcritical, Lemmas 8 and 16(a) imply that there is some $L$ such
that with probability at least $1 - \alpha/2$, $|U| < L$ and $F \cup U$ is a forest with $r$ trees, one containing each
$v_i$. Since adding $M$ to $CSP_{n,p}(\mathcal{P})$ increases the probability of unsatisfiability by at least $\alpha$, it must be
that the probability that $CSP_{n,p}(\mathcal{P})$ is satisfiable, $|U| < L$, $F \cup U$ is such a forest and $CSP_{n,p}(\mathcal{P}) \oplus A$
is unsatisfiable is at least $\alpha/2$.

Suppose that $CSP_{n,p}(\mathcal{P})$ is satisfiable, $|U| < L$ and $F \cup U$ is a forest of $r$ trees, one for each $v_i$.
Note that the forest structure of $F \cup U$ implies that every variable whose value is determined by the
assignment $A$ lies in $F$. Consider some $u \in U$ sharing a constraint with $w \in F$ where $A$ forces $w$ to
take the value $\mu$. Let $\Omega = \Omega(u)$ be the set of values which can be assigned to $u$ which, in conjunction
with assigning $\mu$ to $w$, do not violate their constraint. We know that $|\Omega| \ne 0$ since otherwise $\mu$ is a bad
value and hence some $a_i$ is a bad value. We know that $|\Omega| \ne 1$ since otherwise $u \in F$. So $|\Omega(u)| \ge 2$
for each $u \in U$. Suppose that $u_1, \dots, u_\ell$ are the variables in $U$ with $|\Omega(u_i)| = 2$, and let $\delta_i$ be the value not
in $\Omega(u_i)$. Consider taking a random CSP formed as follows: first take a $CSP_{n,p}(\mathcal{P})$ and then choose $\ell$
random variables $u_1, \dots, u_\ell$ and force $u_i$ to not take value $\delta_i$ using a one-variable constraint. We have
proved that the one-variable constraints boost the probability of unsatisfiability by at least $\alpha/2$.

We have now reached a stage where the rest of the proof is standard. We can use the 
techniques in any of [16, 3, 18] to show that if $\ell$ 1-variable constraints, each of the form "$u_i$ cannot 
receive $\delta_i$", boost the probability of unsatisfiability by at least $a/2$, then so does the addition of some 
constant number of additional random constraints. This will lead to a contradiction of Corollary 7(c). 
We will take the most concise of these techniques, the one from [18] (which was proposed by Alon). 
The main tool is a variant of a theorem of Erdős and Simonovits [14], as stated in [17]: 

Lemma 19 For all positive integers $k, \ell$ and real $0 < \gamma < 1$, there exists $\gamma' > 0$ such that for sufficiently 
large $n$, if $H \subseteq [n]^\ell$ is such that $|H| \geq \gamma n^\ell$ then with probability at least $\gamma'$ a random choice of $\ell$ $k$-tuples 
of integers between 1 and $n$: $(v_1^1, \ldots, v_k^1), \ldots, (v_1^\ell, \ldots, v_k^\ell)$ yields a complete $\ell$-partite system of elements of 
$H$; i.e. for every function $f : [\ell] \to [k]$, the $\ell$-tuple $(v_{f(1)}^1, \ldots, v_{f(\ell)}^\ell) \in H$. 
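
The notion of a complete $\ell$-partite system in Lemma 19 can be made concrete with a minimal sketch. It assumes $H$ is given explicitly as a set of $\ell$-tuples, which is only practical for toy instances; the function name `is_complete_partite_system` is hypothetical.

```python
from itertools import product

def is_complete_partite_system(tuples, H):
    """tuples: a list of ell k-tuples; H: a set of ell-tuples.
    Returns True iff picking one entry from each k-tuple, in every
    possible way (one way per function f: [ell] -> [k]), always
    yields an element of H."""
    return all(choice in H for choice in product(*tuples))

# Toy instance with ell = 2 and k = 2.
pairs = [(1, 2), (3, 4)]
H_full = {(1, 3), (1, 4), (2, 3), (2, 4)}
H_missing = H_full - {(2, 4)}

full_ok = is_complete_partite_system(pairs, H_full)
missing_ok = is_complete_partite_system(pairs, H_missing)
```

Here `product(*tuples)` enumerates exactly the $k^\ell$ selections indexed by the functions $f$, so the check mirrors the definition rather than any efficient algorithm.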

To apply this lemma, we set $k = 2$ and we let $H$ be the set of bad $\ell$-tuples of variables in $\mathrm{CSP}_{n,p}(\mathcal{P})$; 
i.e. those $\ell$-tuples $v_1, \ldots, v_\ell$ such that if we forbid each $v_i$ from receiving $\delta_i$ then the CSP will be 
unsatisfiable. We have shown that choosing a random $\ell$-tuple $v_1, \ldots, v_\ell$ and forbidding each $v_i$ from 
receiving $\delta_i$ boosts the probability of unsatisfiability by at least $a/2$. That random choice will only 
make the CSP unsatisfiable if we choose a bad $\ell$-tuple of variables. Therefore, $|H| \geq (a/2) n^\ell$, and so 
Lemma 19 applies to $H$. 

Now suppose that instead of adding those $\ell$ 1-variable constraints to $\mathrm{CSP}_{n,p}(\mathcal{P})$, we add $\ell$ 
new random constraints selected according to $\mathcal{P}$; call these constraints $C_1, \ldots, C_\ell$. For each value $\delta$, there 
is at least one constraint $C_\delta \in \mathrm{supp}(\mathcal{P})$ which does not allow both variables to receive $\delta$; otherwise 
$\mathrm{CSP}_{n,p}(\mathcal{P})$ would be trivially satisfiable by setting every variable equal to $\delta$. The probability that for 
each $1 \leq i \leq \ell$, $C_i = C_{\delta_i}$ is $\prod_{i=1}^{\ell} \mathcal{P}(C_{\delta_i}) = \zeta > 0$. If this event occurs, then we can treat each $C_i$ as 
a pair (i.e. 2-tuple) of variables at least one of which cannot take the value $\delta_i$. By Lemma 19, with 
probability at least some $\gamma' > 0$ this collection of $\ell$ pairs forms a complete $\ell$-partite system of elements 
of $H$. If so, then the resulting CSP is unsatisfiable. To see this, consider any satisfying assignment and 
for each $1 \leq i \leq \ell$, set $\phi(i)$ to be a variable of $C_i$ which does not have the value $\delta_i$. Then the $\ell$-tuple 
$(\phi(1), \ldots, \phi(\ell))$ is a member of $H$ and hence is bad. Thus there is no satisfying assignment in which each 
$\phi(i)$ does not receive $\delta_i$. 
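
The selection of $\phi(i)$ in the argument above can be sketched as follows. The sketch assumes, for illustration only, that an assignment is a dictionary from variables to values and that each added constraint $C_i$ is recorded as a pair of variables together with the value $\delta_i$ that the two variables may not both receive; the name `pick_phi` is hypothetical.

```python
def pick_phi(assignment, pairs, deltas):
    """For each constraint C_i on the pair (x, y), which forbids x and y
    from both taking delta_i, return a variable of C_i whose assigned
    value differs from delta_i."""
    phi = []
    for (x, y), delta in zip(pairs, deltas):
        if assignment[x] != delta:
            phi.append(x)
        elif assignment[y] != delta:
            phi.append(y)
        else:
            # Both variables received delta_i, so C_i is violated.
            raise ValueError("assignment does not satisfy the added constraints")
    return tuple(phi)

# Toy instance: variables 0..3, two added constraints.
assignment = {0: 1, 1: 2, 2: 3, 3: 1}
pairs = [(0, 1), (2, 3)]
deltas = [1, 3]
phi = pick_phi(assignment, pairs, deltas)
```

Any satisfying assignment yields such a tuple, and when the pairs form a complete $\ell$-partite system every possible tuple lies in $H$, which is the contradiction used in the proof.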

Therefore, adding $\ell$ random constraints increases the probability of unsatisfiability by at least $\zeta\gamma' > 0$. 
So adding a sufficiently large constant number of additional random constraints will boost it 
arbitrarily close to 1. Increasing $p$ by $\epsilon/n$ will a.s. result in at least that many extra constraints. So this 
contradicts Corollary 7(c). This establishes Claim 4 and hence our Lemma. □ 

We close this section by noting why this proof cannot be extended to general $d$. The problem is that 
possibly some of the variables in $U$ would have their domain sizes reduced by two instead of one and so 
some of the 1-variable constraints would be of the form "$u_i$ cannot receive $\delta_i$ or $\gamma_i$". This would prevent 
us from using Lemma 19 and any of the other known techniques for establishing sharp thresholds. 

6 Future Directions 

There is clearly much work still to be done along these lines of research. The big problem still remains: 
determine precisely which models from [31] have a sharp threshold. Of course, Section 2 indicates 
that this may be overly ambitious. In the example of Section 2, $\mathrm{supp}(\mathcal{P})$ is disconnected in that the 
values can be partitioned into two parts (namely {1, 2, 3} and {4, 5}) such that no constraint permits its 
variables to take members of different parts. In [29], it was noted that when $\mathrm{supp}(\mathcal{P})$ is disconnected 
$\mathrm{CSP}_{n,p}(\mathcal{P})$ can behave strangely. So perhaps it is more feasible to determine precisely which models 
with $\mathrm{supp}(\mathcal{P})$ connected have sharp thresholds. An important subgoal would be to do this for binary 
CSP's, i.e. the case where $k = 2$. Another reasonable goal to pursue would be the $d = 3$ case. 

As far as more specific classes of models go, one should try to extend the work in Section 3 and 
examine whether Hypothesis A holds for $H$-homomorphism problems when $H$ is a directed hypergraph. 
Such homomorphism problems are equivalent to CSP's in which every constraint is identical under some 
permutation of the variables. Of course, we showed in Section 3.2 that this is not always true in the 
CSPn^piV) model. But there is a chance that it is true for the CSPn^p{V) model. Also, the example in 
Section 3.2 is not connected. So perhaps Hypothesis A holds for $H$-homomorphism problems whenever 
$H$ is a connected directed hypergraph. Or perhaps one needs to require that $H$ is strongly connected. 
And of course, it would be good to determine whether the "connected" condition can be removed from 
Theorem 5 by answering Question 12. 